ECE1373 HDL

Performance

Revision  Quartus Ver.  Device  Fmax (MHz)  LUTs/ALUTs  FFs      Memory Usage                            Comments
r296      8.0           2C70    --          71,149      49,013   368 M4Ks                                Synthesis only (250 M4Ks available)
r296      8.0           3SL150  47.35       55,921      52,909   304 M9Ks (1844 Memory ALUTs)            Critical path: TG qout -> TG dequeue
r296      8.1           3SL150  45.76       55,725      49,083   355 M9Ks                                Disabled packing to LUTRAM
r303      8.1           3SL150  47.95       61,004      53,417   355 M9Ks                                Removed encoder in Bus1SourcePart and simpler ready signals for bus 1
r305      8.1           3SL150  55.74       61,571      53,428   355 M9Ks                                Decoded MUX, and removed encoder in Bus1DestPart
r327      8.1           3SL150  53.68       66,123      55,979   364 M9Ks                                Fixed wrreq bug in Bus1DestPart (r326), wiring bug in top.v and dequeue bug in Bus0SourcePart
r362      8.1           3SL340  54.40       73,821      62,871   396 M9Ks                                PQ drop tail, max2 fix, addressed Control unit
r296      9.0           3SL340  42.14       66,789      105,097  16 M9Ks (41,268 Memory ALUTs -- why??)  Critical path: same

Note: Fmax is reported for the slow timing corner (1.1 V, 85 °C).

Interface to PC

We use the DE3 ports package to implement the interface between the control PC and the simulator on the FPGA.

List of Ports

  • control [39:0] (I): Control signals to the simulator (8-bit address, 32-bit data)
  • state [63:0] (O): State bits from the simulator used for debugging purposes
  • config [31:0] (I): Configuration words, with handshaking.
  • data [31:0] (O): Data words from the simulator, with handshaking

Control Port Format

Address      Name    Bits [width]  Field
0x00         Legacy  31..6 [26]    Unused
                     5             error-enable
                     4             timer-enable
                     3             loopback
                     2             sim_enable
                     1             data_request
                     0             reset
0x01         Timer   31..0 [32]    Timer initial value
0x02 - 0xFF  unused
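
As a rough illustration of how a host-side script might assemble a control word, the sketch below builds the address-0x00 flag word from the bit positions listed above and packs an 8-bit address with 32-bit data into 40 bits. Placing the address in bits 39..32 is an assumption (this page does not state the ordering), and the names CTRL_BITS, legacy_word and control_word are hypothetical.

```python
# Bit positions of the address-0x00 flags, taken from the Control Port Format table.
CTRL_BITS = {
    "reset": 0,
    "data_request": 1,
    "sim_enable": 2,
    "loopback": 3,
    "timer_enable": 4,
    "error_enable": 5,
}


def legacy_word(**flags):
    """Build the 32-bit data word for address 0x00 from boolean flags."""
    word = 0
    for name, value in flags.items():
        if value:
            word |= 1 << CTRL_BITS[name]  # raises KeyError on an unknown flag
    return word


def control_word(address, data):
    """Pack an 8-bit address and 32-bit data into a 40-bit control value.

    ASSUMPTION: address occupies bits 39..32 and data bits 31..0.
    """
    if not 0 <= address <= 0xFF:
        raise ValueError("address must fit in 8 bits")
    if not 0 <= data <= 0xFFFFFFFF:
        raise ValueError("data must fit in 32 bits")
    return (address << 32) | data
```

For example, control_word(0x01, 1000) would load the timer initial value with 1000 under the assumed layout.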

State Port Format

Bits [width]  Field
7             reset
6             deserializer reset
5             loopback
4             enable
3             error
2..0 [3]      control-unit state
15            timer_enable
14            timer_stop
13            error_enable
12            error_stop
11            config_enable
10            config_req
9             data_req
8             data_ready
31..16 [16]   Simulation time
63..32 [32]   Current timer value
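
When debugging from the PC side it can help to split the 64-bit state word into named fields. The following is a minimal sketch that follows the bit positions in the format above; the function name decode_state and the dictionary keys are hypothetical, not part of the design.

```python
# Single-bit flags of the state port, keyed by bit position.
STATE_FLAGS = {
    3: "error",
    4: "enable",
    5: "loopback",
    6: "deserializer_reset",
    7: "reset",
    8: "data_ready",
    9: "data_req",
    10: "config_req",
    11: "config_enable",
    12: "error_stop",
    13: "error_enable",
    14: "timer_stop",
    15: "timer_enable",
}


def decode_state(state):
    """Split a 64-bit state word into a dict of named fields."""
    fields = {
        "control_unit_state": state & 0x7,       # bits 2..0
        "sim_time": (state >> 16) & 0xFFFF,      # bits 31..16
        "timer": (state >> 32) & 0xFFFFFFFF,     # bits 63..32
    }
    for bit, name in STATE_FLAGS.items():
        fields[name] = (state >> bit) & 1
    return fields
```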

Packet Format

Bits [width]  Field
63..48 [16]   Timestamp
47..40 [8]    Destination
39..28 [12]   Size
27..20 [8]    Source
19..4 [16]    Injection time
3..0 [4]      Unused
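
The packet layout can be exercised with a small pack/unpack helper. This is only a sketch: it assumes the trailing 4-bit unused field occupies bits 3..0 (the six field widths sum to 64), and the names PACKET_FIELDS, pack_packet and unpack_packet are hypothetical.

```python
# (name, width) pairs from the most significant field to the least significant.
PACKET_FIELDS = [
    ("timestamp", 16),
    ("destination", 8),
    ("size", 12),
    ("source", 8),
    ("injection_time", 16),
    ("unused", 4),  # ASSUMPTION: bits 3..0
]


def pack_packet(**values):
    """Pack named fields into a 64-bit packet word; missing fields default to 0."""
    word = 0
    for name, width in PACKET_FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack_packet(word):
    """Split a 64-bit packet word into a dict of named fields."""
    fields = {}
    for name, width in reversed(PACKET_FIELDS):
        fields[name] = word & ((1 << width) - 1)
        word >>= width
    return fields
```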

Address Format

Bits [width]  Field
7 [1]         sBus ID
6..5 [2]      Dest Partition ID
4..0 [5]      Node ID

Note: Address 0x00 is treated as invalid, so nodes on bus 0, partition 0 start at ID 1, and this partition can hold at most 31 nodes instead of 32. Since we have fewer Router nodes than TG/PQ nodes, and Router nodes are on sBus 0 and dBus 1, using sBus ID as bit 1 allows us to have the full 32 nodes for the TG/PQ dest partition 0.
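
To make the address layout and the reserved 0x00 value concrete, here is a minimal sketch of an encoder and decoder. The helper names encode_addr and decode_addr are hypothetical.

```python
def encode_addr(sbus, dest_part, node):
    """Build an 8-bit node address: sBus ID (bit 7), dest partition (6..5), node ID (4..0)."""
    if sbus not in (0, 1):
        raise ValueError("sBus ID is a single bit")
    if not 0 <= dest_part <= 3:
        raise ValueError("dest partition ID must fit in 2 bits")
    if not 0 <= node <= 31:
        raise ValueError("node ID must fit in 5 bits")
    addr = (sbus << 7) | (dest_part << 5) | node
    if addr == 0:
        raise ValueError("address 0x00 is treated as invalid")
    return addr


def decode_addr(addr):
    """Return (sbus, dest_part, node) for an 8-bit address."""
    return (addr >> 7) & 1, (addr >> 5) & 0x3, addr & 0x1F
```

With sBus ID set, node 0 of dest partition 0 encodes to 0x80, so all 32 node IDs stay usable there, while sBus 0, partition 0 loses node 0 to the reserved address.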

Global Signals

  • sim_time_tick: Pulse for 1 clock cycle when sim_time changes

Statistics

Counters are shifted out of the design 8 bits at a time, little endian.

Signal          Description
stats_in[7:0]   Input of shift chain
stats_out[7:0]  Output of shift chain
stats_shift     Shift by 8 bits at next clock when asserted

Unit                                Counter              Description
PartitionPQ 0-3
  TrafficGenDiv 0-7
    TrafficGenDiv                   stats_received[32]   Number of packets received by TG
    TrafficGenDiv                   stats_injected[32]   Number of packets injected by TG
    TG->PacketQueueDivCore qin      packets_dropped[32]  Number of packets dropped by PQ
    TG->PacketQueueDivCore qout     packets_dropped[32]  Number of packets dropped by PQ
  PacketQueueDivCore 8-15           packets_dropped[32]  Each packet queue has one counter
  TrafficGenDiv 16-23                                    Same as TrafficGenDiv above
  PacketQueueDivCore 24-31                               Same as PacketQueueDivCore above

A table is a poor way to express the recursive chaining of the stats[7:0] shift chain. When shifting out, the least significant 8 bits of each counter appear on stats_out first. Within each TG, qout's packets_dropped appears first, then qin's packets_dropped, then stats_injected, then stats_received. Within each PPQ (PartitionPQ), PQ 31-24 appears first, then TG 23-16, then PQ 15-8, then TG 7-0. PartitionPQ3 appears before PartitionPQ0.
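
The shift-out order described above can be sketched as a host-side parser. This is an illustration under stated assumptions, not the actual readout code: it assumes that within each range the highest-numbered unit comes out first (PQ 31 down to 24, and so on), that partitions come out in the order 3, 2, 1, 0, and that every counter is 32 bits wide. The names counter_order and parse_stats are hypothetical.

```python
import struct

# Order in which one TG's counters leave the chain, per the text above.
TG_COUNTERS = [
    "qout.packets_dropped",
    "qin.packets_dropped",
    "stats_injected",
    "stats_received",
]


def counter_order(num_ppq=4):
    """List (ppq, node, counter) tuples in the order they appear on stats_out.

    ASSUMPTION: highest-numbered partition and node are shifted out first.
    """
    order = []
    for ppq in reversed(range(num_ppq)):
        for node in reversed(range(32)):
            is_tg = (node // 8) % 2 == 0  # nodes 0-7 and 16-23 are TGs
            if is_tg:
                order.extend((ppq, node, name) for name in TG_COUNTERS)
            else:
                order.append((ppq, node, "packets_dropped"))
    return order


def parse_stats(byte_stream, num_ppq=4):
    """Reassemble little-endian 32-bit counters from the bytes seen on stats_out."""
    order = counter_order(num_ppq)
    data = bytes(byte_stream)
    if len(data) != 4 * len(order):
        raise ValueError("unexpected stats stream length")
    values = struct.unpack("<%dI" % len(order), data)
    return dict(zip(order, values))
```

Under these assumptions a full readout is 320 counters, or 1280 assertions of stats_shift.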

Simulation Node

TrafficGen, Router, and PacketQueue nodes have a standard interface.

Input

  • clock, reset
  • enable
  • config_in_valid
  • config_in [7:0]: Configuration input channel. Valid when config_in_valid is 1
  • sim_time [15:0]: Global simulation time counter
  • sim_time_tick: A pulse indicates that sim_time has incremented
  • dequeue: Dequeue the head packet
  • nexthop_in [7:0]: Broadcast on the destination bus to specify which node should accept the incoming packet
  • packet_in [63:0]: Valid for this node when nexthop_in = addr2
  • ready_in: Indicates valid data on nexthop_in and packet_in

Output

  • config_out_valid
  • config_out [7:0]: Configuration shifter out
  • error
  • ready [1:0]: Indicates whether a packet is ready for timestep T (bit 0) or T+1 (bit 1)
  • packet_out [63:0]: Valid when either ready[0] or ready[1] is 1
  • nexthop_out [7:0]: Next hop for this packet
  • timestamp_out [1:0]: Bits [2:1] of packet_out timestamp - (T-6)
  • packet_ack: The incoming packet is accepted

By Function

Function                 Signals
Global                   in: clock, reset, enable, sim_time, sim_time_tick
Configuration            in: config_in_valid, config_in
                         out: config_out_valid, config_out
Status                   out: error
Destination Bus          in: nexthop_in, packet_in, ready_in
                         out: packet_ack
Interconnect SourcePart  in: dequeue
                         out: ready (or packet_en), packet_out, nexthop_out, timestamp_out

Configurable Fields

Configuration fields are set by shifting in 8-bit words one at a time through the config shifter. Each node's config_out_valid and config_out outputs should be registered so that when components are cascaded in a daisy chain we don't end up with a long unregistered path.

TrafficGen
  • addr2 [7:0]
  • size [15:0]: Size of the packets that this TG will generate. Only the lower 12 bits are used.
  • threshold [31:0]: If the RNG output is lower than threshold, a new packet is generated
    • If the user wants packet interval = N timesteps, then threshold = 0xFFFFFFFF / N
  • send_to [7:0]: Destination node that this TG will send to
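
The threshold rule for the TG can be checked with a few lines of arithmetic. This sketch assumes the RNG output is a uniformly distributed 32-bit value, which this page does not state; the function names are hypothetical.

```python
def threshold_for_interval(n_timesteps):
    """Threshold giving an average packet interval of n_timesteps (0xFFFFFFFF / N)."""
    if n_timesteps < 1:
        raise ValueError("interval must be at least one timestep")
    return 0xFFFFFFFF // n_timesteps


def injection_probability(threshold):
    """Chance per timestep that a uniform 32-bit RNG output is below threshold.

    ASSUMPTION: the RNG output is uniform over 0 .. 2**32 - 1.
    """
    return threshold / 2**32
```

For N = 4 this gives a threshold of 0x3FFFFFFF, that is, roughly a one-in-four chance of generating a packet on each timestep.
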
Router
  • addr2 [7:0]
  • Routing table entries (512 per 2 routers)
PacketQueue
  • addr2 [7:0]
  • nexthop [7:0]
  • bandwidth [11:0]: Bandwidth of this link (max 4 KB per ms -> 32 Mbps)
  • latency [11:0]: Latency of this link (max 2 s)
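
The "4 KB per ms -> 32 Mbps" figure for the bandwidth field can be reproduced with the conversion below. It assumes the 12-bit field counts bytes per millisecond, which is inferred from the stated maximum and not confirmed here; the function name is hypothetical.

```python
def bandwidth_field_to_mbps(field):
    """Convert the 12-bit bandwidth field to megabits per second.

    ASSUMPTION: the field is a byte count per millisecond.
    """
    if not 0 <= field <= 0xFFF:
        raise ValueError("bandwidth must fit in 12 bits")
    bits_per_second = field * 8 * 1000  # bytes/ms -> bits/s
    return bits_per_second / 1e6
```

The largest encodable value, 4095, corresponds to about 32.76 Mbps under this assumption.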

To Do

  • Add stats counters in the nodes
  • Add data path to shift out the stats counters
  • Make router drop unroutable packets

Source Partition Select

Selects the packet with the earliest timestamp among the source nodes of a partition and generates the select signal for the source partition bus.

Input

  • clock, reset
  • timestamp_in [1:0] x param_num_nodes: 2-bit timestamp for comparison.

Output

  • packet_sel [N:0]: One-hot encoding of the node with the earliest packet
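
A behavioral sketch of the select logic, useful as a reference model in a testbench script. It assumes every node presents a packet and that ties go to the lowest-numbered node; neither detail is specified on this page, and the function name select_earliest is hypothetical.

```python
def select_earliest(timestamps):
    """Return a one-hot mask (bit i set means node i) for the smallest 2-bit timestamp.

    ASSUMPTIONS: all nodes have a packet ready; ties go to the lowest index.
    """
    if not timestamps:
        raise ValueError("need at least one node")
    if any(not 0 <= t <= 3 for t in timestamps):
        raise ValueError("timestamps are 2-bit values")
    # min() returns the first minimal index, which implements the tie-break.
    winner = min(range(len(timestamps)), key=lambda i: timestamps[i])
    return 1 << winner
```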

Source Partition Bus

Input

  • clock, reset
  • packet_in [63:0] x param_num_nodes: Incoming packets to the bus MUX
  • packet_sel [N:0]: One-hot select signal indicating which node is granted access to the bus

Output

  • packet_out [63:0]: Packet selected for this partition

Parameter

  • param_num_nodes: Number of nodes attached to this partition bus
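
Because packet_sel is one-hot, the bus MUX can be modeled as an AND-OR structure with no encoder. The sketch below is a behavioral model only, with the hypothetical name onehot_mux; it returns 0 when no select bit is set, which is an assumption about idle behavior.

```python
def onehot_mux(packets, packet_sel):
    """Select one 64-bit packet using a one-hot select mask (bit i picks packets[i])."""
    out = 0
    for i, packet in enumerate(packets):
        if (packet_sel >> i) & 1:
            out |= packet  # with a valid one-hot select only one term is non-zero
    return out & 0xFFFFFFFFFFFFFFFF
```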

Destination Partition Bus

Input

  • clock, reset
  • packet_in [63:0] x num_source_partitions: Incoming packets to the crossbar output

Output

To dest_in_queues...

Parameter

  • param_num_nodes: Number of nodes attached to this partition bus