Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[hw,sram_ctrl,rtl] Add flop stage on RAM output #25952

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

Razer6
Copy link
Member

@Razer6 Razer6 commented Jan 20, 2025

This PR adds support for an optional flop stage on the RAM macro output. The support for this stage is already present but this PR adds the necesary parameters to sram_ctrl to be able to configure it. For Darjeeling, this flop stage is enabled on all sram_ctrl isntances. For the other tops it remains disabled.

@Razer6 Razer6 requested review from vogelpi, a-will and nasahlpa and removed request for vogelpi January 20, 2025 18:47
@a-will
Copy link
Contributor

a-will commented Jan 20, 2025

Does this actually work as-is?

I'm surprised to see no changes to any control signals (which also are frequently not connected, suggesting that there is an implicit dependence on single-cycle read latency).

@Razer6 Razer6 force-pushed the sram-ctrl-flop-stage branch from 7a899a5 to e1d18e5 Compare January 21, 2025 06:25
@Razer6
Copy link
Member Author

Razer6 commented Jan 21, 2025

Hmm the logic for the output stage is implemented in prim_ram_1p_adv enabled via the existing parameter EnableOutputPipeline. That delays the output logic, i.e., rvalid, rdata, and error via:

  if (EnableOutputPipeline) begin : gen_regslice_output
    // Put the register slices between ECC decoding to output

    // If no ECC or parity is used, do not use prim_flop to allow synthesis
    // tool to optimize the registers.
    if (EnableECC || EnableParity) begin : gen_prim_rvalid_flop
      prim_flop #(
        .Width(MuBi4Width),
        .ResetValue(MuBi4Width'(MuBi4False))
      ) u_rvalid_flop (
        .clk_i,
        .rst_ni,
        .d_i(MuBi4Width'(rvalid_d)),
        .q_o({rvalid_q})
      );
    end else begin: gen_no_prim_rvalid_flop
      always_ff @(posedge clk_i or negedge rst_ni) begin
        if (!rst_ni) begin
          rvalid_q <= MuBi4False;
        end else begin
          rvalid_q <= rvalid_d;
        end
      end
    end

    always_ff @(posedge clk_i or negedge rst_ni) begin
      if (!rst_ni) begin
        rdata_q  <= '0;
        rerror_q <= '0;
      end else begin
        rdata_q  <= rdata_d;
        // tie to zero if the read data is not valid
        rerror_q <= rerror_d & {2{mubi4_test_true_loose(rvalid_d)}};
      end
    end
  end else begin : gen_dirconnect_output
  ...
  end

@a-will
Copy link
Contributor

a-will commented Jan 21, 2025

Ah, but that's only the generation. Don't forget to look at the consumers: Some of them ignore rvalid!

Single-cycle read latency is assumed in various parts of the code base, unfortunately. This one also appears to be special, with the conversion to / from mubi4.

@Razer6
Copy link
Member Author

Razer6 commented Jan 21, 2025

Dang it. That changed a little bit. Will factor rvalid in such that it works independent of the latency. Flops are needed. They are needed everywhere 🚀

@Razer6 Razer6 force-pushed the sram-ctrl-flop-stage branch from e1d18e5 to 9a83bcb Compare January 21, 2025 11:19
This PR adds support for an optional flop stage on the RAM
macro output. The support for this stage is already present
but this PR adds the necesary parameters to sram_ctrl to be
able to configure it. For Darjeeling, this flop stage is
enabled on all sram_ctrl isntances. For the other tops it
remains disabled.

Signed-off-by: Robert Schilling <rschilling@rivosinc.com>
@Razer6 Razer6 force-pushed the sram-ctrl-flop-stage branch from 9a83bcb to 830eeeb Compare January 21, 2025 12:43
@@ -374,14 +376,14 @@ module prim_ram_1p_scr import prim_ram_1p_pkg::*; #(
logic [Width-1:0] wdata_scr;
assign wdata_scr = (mubi4_test_true_loose(write_pending_q)) ? wdata_scr_q : wdata_scr_d;

mubi4_t rvalid_q;
logic rvalid;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect this was a mubi as a countermeasure. @vogelpi the mubi rvalid shouldn't be dropped, should it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm... but prim_ram_1p_adv already does mubi for control signals, doesn't it? Maybe this is fine.

Copy link
Member Author

@Razer6 Razer6 Jan 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I thought about it. But that maybe needs a bit of discussion. In general the code uses a Mubi and then translates that back to a single bit to rvalid_o. I would argue it's the same effort faulting the combinational logic of rvalid_o and the combinational logic of rvalid, which is the boolean translated mubi.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey both,
Let's try to keep this as MuBi and let's keep the flops below. If the new parameter is enabled, a second flop stage should be instantiated. Ideally the result of the output of the flop stage should be compared to the rvalid signal from the macro in in case of a mismatch we should factor this into alert_o. Also an alert should fire in case the flop stage doesn't hold a valid MuBi encoding. This is currently not done and I don't know why. @nasahlpa do you maybe remember? Did we try this and then ran into an issue?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed this in a meeting and @nasahlpa brought up that the rvalid is buffered here to properly handle the address collision case (see addr_collision_q). This needs to be factored in as well when adding the flop stage. @Razer6 will look into this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants