Widen L2 cache line to support burst access #2647

Merged
merged 1 commit into from
Jan 16, 2025


occheung
Contributor

@occheung occheung commented Dec 24, 2024

ARTIQ Pull Request

Description of Changes

This patch simply configures the line size. See the PR in MiSoC that implements the gateware changes.

Note: one might be tempted to increase the line size further (e.g. to 128 bytes). MiSoC splits the memory into 8-bit-wide chunks, and the L2 cache size is 128 KB, so a 128-byte line gives 1024 lines and each chunk would hold 1 B * 1024 = 8 Kb. Vivado allocates each chunk to the smallest block RAM primitive (18 Kb), hence wasting more than half of the block RAM resources. Fixed in MiSoC and Migen.

A 64-byte line size would largely mitigate this issue, and still fully supports the 64-byte cache lines in the VexRiscv caches.
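
The block RAM estimate in the note above can be reproduced with a few lines of arithmetic. This is only a sketch under stated assumptions: the helper name `grain_usage` is hypothetical, each 8-bit chunk is assumed to map to exactly one 18 Kb block RAM, and the chunk depth is assumed to equal the number of cache lines.

```python
# Sketch only: assumes one 8-bit chunk ("grain") per RAMB18 and a chunk
# depth equal to the number of cache lines. Names here are illustrative.
RAMB18_BITS = 18 * 1024  # nominal capacity of the 18 Kb block RAM


def grain_usage(cache_bytes, line_bytes, grain_bits=8):
    """Return (lines, bits_per_grain, fill_fraction) for one chunk memory."""
    lines = cache_bytes // line_bytes
    bits = lines * grain_bits
    return lines, bits, bits / RAMB18_BITS


if __name__ == "__main__":
    for line_bytes in (64, 128):
        lines, bits, fill = grain_usage(128 * 1024, line_bytes)
        print(f"{line_bytes:3d} B line: {lines} x 8 = {bits // 1024} Kb "
              f"per chunk, {fill:.0%} of one RAMB18")
```

Under these assumptions a 128-byte line leaves each block RAM less than half full, while a 64-byte line yields 2048 x 8 = 16 Kb per chunk, which fits in a single 18 Kb block RAM with little waste.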

Test

Passes artiq.test on Kasli v2. Gateware meets timing. See the PR in MiSoC for performance tests.

Type of Changes

Type
✨ New feature

Steps (Choose relevant, delete irrelevant before submitting)

All Pull Requests

  • Use correct spelling and grammar.

Code Changes

Git Logistics

  • Split your contribution into logically separate changes (git rebase --interactive). Merge/squash/fixup commits that just fix or amend previous commits. Remove unintended changes & cleanup. See tutorial.
  • Write short & meaningful commit messages. Review each commit for messages (git show). Format:
    topic: description. < 50 characters total.
    
    Longer description. < 70 characters per line
    

Licensing

See copyright & licensing for more info.
ARTIQ files that do not contain a license header are copyrighted by M-Labs Limited and are licensed under LGPLv3+.

@occheung
Contributor Author

occheung commented Jan 13, 2025

This may have an impact on block RAM utilization in ARTIQ after undoing the manual FIFO level logic optimization. Tested using Vivado 2024.2. RTL objects that appear in both scenarios are omitted from the tables below.

Without manual opt:

+-----------------------------------------+----------------------------------------+------------------------+---+---+------------------------+---+---+------------------+--------+--------+
|Module Name                              | RTL Object                             | PORT A (Depth x Width) | W | R | PORT B (Depth x Width) | W | R | Ports driving FF | RAMB18 | RAMB36 | 
+-----------------------------------------+----------------------------------------+------------------------+---+---+------------------------+---+---+------------------+--------+--------+
|top__GCB1                                | buffer_space_2_reg                     | 256 x 16(WRITE_FIRST)  | W | R |                        |   |   | Port A           | 1      | 0      | 
|top__GCB1                                | buffer_space_1_reg                     | 256 x 16(WRITE_FIRST)  | W | R |                        |   |   | Port A           | 1      | 0      | 
|top__GCB6                                | buffer_space_3_reg                     | 256 x 16(WRITE_FIRST)  | W | R |                        |   |   | Port A           | 1      | 0      | 
|top__GCB3                                | mem_grain0_2_reg                       | 191 x 8(WRITE_FIRST)   |   | R | 191 x 8(WRITE_FIRST)   | W | R | Port A and B     | 1      | 0      | 
|top__GCB3                                | mem_grain1_2_reg                       | 191 x 8(WRITE_FIRST)   |   | R | 191 x 8(WRITE_FIRST)   | W | R | Port A and B     | 1      | 0      | 
|top__GCB3                                | mem_grain2_2_reg                       | 191 x 8(WRITE_FIRST)   |   | R | 191 x 8(WRITE_FIRST)   | W | R | Port A and B     | 1      | 0      | 
|top__GCB3                                | mem_grain3_2_reg                       | 191 x 8(WRITE_FIRST)   |   | R | 191 x 8(WRITE_FIRST)   | W | R | Port A and B     | 1      | 0      | 
|top__GCB3                                | mem_grain4_2_reg                       | 191 x 8(WRITE_FIRST)   |   | R | 191 x 8(WRITE_FIRST)   | W | R | Port A and B     | 1      | 0      | 
|top__GCB3                                | mem_grain5_2_reg                       | 191 x 8(WRITE_FIRST)   |   | R | 191 x 8(WRITE_FIRST)   | W | R | Port A and B     | 1      | 0      | 
|top__GCB3                                | mem_grain6_2_reg                       | 191 x 8(WRITE_FIRST)   |   | R | 191 x 8(WRITE_FIRST)   | W | R | Port A and B     | 1      | 0      | 
|top__GCB3                                | mem_grain7_2_reg                       | 191 x 8(WRITE_FIRST)   |   | R | 191 x 8(WRITE_FIRST)   | W | R | Port A and B     | 1      | 0      | 
|top__GCB3                                | mem_grain0_1_reg                       | 191 x 8(WRITE_FIRST)   |   | R | 191 x 8(WRITE_FIRST)   | W | R | Port A and B     | 1      | 0      | 
|top__GCB3                                | mem_grain1_1_reg                       | 191 x 8(WRITE_FIRST)   |   | R | 191 x 8(WRITE_FIRST)   | W | R | Port A and B     | 1      | 0      | 
|top__GCB3                                | mem_grain2_1_reg                       | 191 x 8(WRITE_FIRST)   |   | R | 191 x 8(WRITE_FIRST)   | W | R | Port A and B     | 1      | 0      | 
|top__GCB3                                | mem_grain3_1_reg                       | 191 x 8(WRITE_FIRST)   |   | R | 191 x 8(WRITE_FIRST)   | W | R | Port A and B     | 1      | 0      | 
|top__GCB3                                | mem_grain4_1_reg                       | 191 x 8(WRITE_FIRST)   |   | R | 191 x 8(WRITE_FIRST)   | W | R | Port A and B     | 1      | 0      | 
|top__GCB3                                | mem_grain5_1_reg                       | 191 x 8(WRITE_FIRST)   |   | R | 191 x 8(WRITE_FIRST)   | W | R | Port A and B     | 1      | 0      | 
|top__GCB3                                | mem_grain6_1_reg                       | 191 x 8(WRITE_FIRST)   |   | R | 191 x 8(WRITE_FIRST)   | W | R | Port A and B     | 1      | 0      | 
|top__GCB3                                | mem_grain7_1_reg                       | 191 x 8(WRITE_FIRST)   |   | R | 191 x 8(WRITE_FIRST)   | W | R | Port A and B     | 1      | 0      | 
|top__GCB3                                | mem_grain0_reg                         | 191 x 8(WRITE_FIRST)   |   | R | 191 x 8(WRITE_FIRST)   | W | R | Port A and B     | 1      | 0      | 
|top__GCB3                                | mem_grain1_reg                         | 191 x 8(WRITE_FIRST)   |   | R | 191 x 8(WRITE_FIRST)   | W | R | Port A and B     | 1      | 0      | 
|top__GCB3                                | mem_grain2_reg                         | 191 x 8(WRITE_FIRST)   |   | R | 191 x 8(WRITE_FIRST)   | W | R | Port A and B     | 1      | 0      | 
|top__GCB3                                | mem_grain3_reg                         | 191 x 8(WRITE_FIRST)   |   | R | 191 x 8(WRITE_FIRST)   | W | R | Port A and B     | 1      | 0      | 
|top__GCB3                                | mem_grain4_reg                         | 191 x 8(WRITE_FIRST)   |   | R | 191 x 8(WRITE_FIRST)   | W | R | Port A and B     | 1      | 0      | 
|top__GCB3                                | mem_grain5_reg                         | 191 x 8(WRITE_FIRST)   |   | R | 191 x 8(WRITE_FIRST)   | W | R | Port A and B     | 1      | 0      | 
|top__GCB3                                | mem_grain6_reg                         | 191 x 8(WRITE_FIRST)   |   | R | 191 x 8(WRITE_FIRST)   | W | R | Port A and B     | 1      | 0      | 
|top__GCB3                                | mem_grain7_reg                         | 191 x 8(WRITE_FIRST)   |   | R | 191 x 8(WRITE_FIRST)   | W | R | Port A and B     | 1      | 0      | 
|top__GCB0                                | buffer_space_reg                       | 256 x 16(WRITE_FIRST)  | W | R |                        |   |   | Port A           | 1      | 0      | 
+-----------------------------------------+----------------------------------------+------------------------+---+---+------------------------+---+---+------------------+--------+--------+

With manual opt:

+-----------------------------------------+----------------------------------------+------------------------+---+---+------------------------+---+---+------------------+--------+--------+
|Module Name                              | RTL Object                             | PORT A (Depth x Width) | W | R | PORT B (Depth x Width) | W | R | Ports driving FF | RAMB18 | RAMB36 | 
+-----------------------------------------+----------------------------------------+------------------------+---+---+------------------------+---+---+------------------+--------+--------+
|top__GCB4                                | data_mem_grain37_reg                   | 2 K x 8(WRITE_FIRST)   | W | R |                        |   |   | Port A           | 1      | 0      | 
|top__GCB4                                | data_mem_grain38_reg                   | 2 K x 8(WRITE_FIRST)   | W | R |                        |   |   | Port A           | 1      | 0      | 
|top__GCB4                                | data_mem_grain39_reg                   | 2 K x 8(WRITE_FIRST)   | W | R |                        |   |   | Port A           | 1      | 0      | 
|top__GCB4                                | data_mem_grain40_reg                   | 2 K x 8(WRITE_FIRST)   | W | R |                        |   |   | Port A           | 1      | 0      | 
|top__GCB4                                | data_mem_grain41_reg                   | 2 K x 8(WRITE_FIRST)   | W | R |                        |   |   | Port A           | 1      | 0      | 
|top__GCB4                                | data_mem_grain42_reg                   | 2 K x 8(WRITE_FIRST)   | W | R |                        |   |   | Port A           | 1      | 0      | 
|top__GCB4                                | data_mem_grain43_reg                   | 2 K x 8(WRITE_FIRST)   | W | R |                        |   |   | Port A           | 1      | 0      | 
|top__GCB4                                | data_mem_grain44_reg                   | 2 K x 8(WRITE_FIRST)   | W | R |                        |   |   | Port A           | 1      | 0      | 
|top__GCB4                                | data_mem_grain45_reg                   | 2 K x 8(WRITE_FIRST)   | W | R |                        |   |   | Port A           | 1      | 0      | 
|top__GCB4                                | data_mem_grain46_reg                   | 2 K x 8(WRITE_FIRST)   | W | R |                        |   |   | Port A           | 1      | 0      | 
|top__GCB4                                | data_mem_grain47_reg                   | 2 K x 8(WRITE_FIRST)   | W | R |                        |   |   | Port A           | 1      | 0      | 
|top__GCB4                                | data_mem_grain48_reg                   | 2 K x 8(WRITE_FIRST)   | W | R |                        |   |   | Port A           | 1      | 0      | 
|top__GCB4                                | data_mem_grain49_reg                   | 2 K x 8(WRITE_FIRST)   | W | R |                        |   |   | Port A           | 1      | 0      | 
|top__GCB4                                | data_mem_grain50_reg                   | 2 K x 8(WRITE_FIRST)   | W | R |                        |   |   | Port A           | 1      | 0      | 
|top__GCB4                                | data_mem_grain51_reg                   | 2 K x 8(WRITE_FIRST)   | W | R |                        |   |   | Port A           | 1      | 0      | 
|top__GCB4                                | data_mem_grain52_reg                   | 2 K x 8(WRITE_FIRST)   | W | R |                        |   |   | Port A           | 1      | 0      | 
|top__GCB4                                | data_mem_grain53_reg                   | 2 K x 8(WRITE_FIRST)   | W | R |                        |   |   | Port A           | 1      | 0      | 
|top__GCB4                                | data_mem_grain54_reg                   | 2 K x 8(WRITE_FIRST)   | W | R |                        |   |   | Port A           | 1      | 0      | 
|top__GCB4                                | data_mem_grain55_reg                   | 2 K x 8(WRITE_FIRST)   | W | R |                        |   |   | Port A           | 1      | 0      | 
|top__GCB4                                | data_mem_grain56_reg                   | 2 K x 8(WRITE_FIRST)   | W | R |                        |   |   | Port A           | 1      | 0      | 
|top__GCB4                                | data_mem_grain57_reg                   | 2 K x 8(WRITE_FIRST)   | W | R |                        |   |   | Port A           | 1      | 0      | 
|top__GCB4                                | data_mem_grain58_reg                   | 2 K x 8(WRITE_FIRST)   | W | R |                        |   |   | Port A           | 1      | 0      | 
|top__GCB4                                | data_mem_grain59_reg                   | 2 K x 8(WRITE_FIRST)   | W | R |                        |   |   | Port A           | 1      | 0      | 
|top__GCB4                                | data_mem_grain60_reg                   | 2 K x 8(WRITE_FIRST)   | W | R |                        |   |   | Port A           | 1      | 0      | 
|top__GCB4                                | data_mem_grain61_reg                   | 2 K x 8(WRITE_FIRST)   | W | R |                        |   |   | Port A           | 1      | 0      | 
|top__GCB4                                | data_mem_grain62_reg                   | 2 K x 8(WRITE_FIRST)   | W | R |                        |   |   | Port A           | 1      | 0      | 
|top__GCB4                                | data_mem_grain63_reg                   | 2 K x 8(WRITE_FIRST)   | W | R |                        |   |   | Port A           | 1      | 0      | 
|top__GCB4                                | mem_grain0_2_reg                       | 191 x 8(WRITE_FIRST)   |   | R | 191 x 8(WRITE_FIRST)   | W | R | Port A and B     | 1      | 0      | 
+-----------------------------------------+----------------------------------------+------------------------+---+---+------------------------+---+---+------------------+--------+--------+

Objects with higher BRAM utilization are replaced with those of lower utilization. Some variants may cause LUT over-utilization and fail the Vivado placer.
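
To make the utilization difference concrete, the "Depth x Width" column of the reports above can be converted into a fill ratio per RAMB18. The following is a rough sketch, not part of the patch: `parse_shape` and `fill_ratio` are hypothetical helpers, "K" is assumed to mean 1024, and the nominal 18 Kb capacity is used as the denominator.

```python
import re

RAMB18_BITS = 18 * 1024  # nominal capacity of one RAMB18

# Matches shapes such as "191 x 8(WRITE_FIRST)" or "2 K x 8(WRITE_FIRST)".
SHAPE_RE = re.compile(r"^\s*(\d+)\s*(K)?\s*x\s*(\d+)")


def parse_shape(cell):
    """Parse a 'Depth x Width' report cell into (depth, width)."""
    match = SHAPE_RE.match(cell)
    if match is None:
        raise ValueError(f"unrecognized shape: {cell!r}")
    depth = int(match.group(1))
    if match.group(2):  # assumption: 'K' denotes a factor of 1024
        depth *= 1024
    return depth, int(match.group(3))


def fill_ratio(cell, ramb18_count=1):
    """Fraction of the allocated RAMB18 capacity that the memory uses."""
    depth, width = parse_shape(cell)
    return depth * width / (ramb18_count * RAMB18_BITS)


if __name__ == "__main__":
    for cell in ("2 K x 8(WRITE_FIRST)",
                 "256 x 16(WRITE_FIRST)",
                 "191 x 8(WRITE_FIRST)"):
        print(f"{cell:24s} {fill_ratio(cell):.1%}")
```

Under these assumptions the 2 K x 8 cache data memories use most of a RAMB18, whereas the 191 x 8 and 256 x 16 memories leave most of it empty.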

Tested variant JSON entries:

{
    "target": "kasli",
    "variant": "tester",
    "hw_rev": "v2.0",
    "drtio_role": "master",
    "peripherals": [
        {
            "type": "grabber",
            "ports": [10, 11]
        },          
        {
            "type": "urukul",
            "dds": "ad9910",
            "ports": [0, 1],
            "clk_sel": 2
        },
        {
            "type": "sampler",
            "ports": [2, 3]
        },
        {
            "type": "zotino",
            "ports": [4]
        },
        {
            "type": "dio",
            "ports": [5],
            "edge_counter": true,
            "bank_direction_low": "input",
            "bank_direction_high": "output"
        },
        {
            "type": "mirny",
            "ports": [6],
            "almazny": true
        },
        {
            "type": "fastino",
            "ports": [7]
        },
        {
            "type": "phaser",
            "ports": [8]
        },
        {
            "type": "shuttler",
            "ports": [9]
        }
    ]
}

@occheung
Contributor Author

Cache can now be synthesized as block RAMs after m-labs/misoc#159.

@occheung occheung marked this pull request as ready for review January 16, 2025 05:04
@sbourdeauducq sbourdeauducq merged commit 2b48822 into m-labs:master Jan 16, 2025