- This is a Go application that follows the instructions from this book to build a distributed log service.
Offset refers to the logical location of an entry in the file. E.g., if we have two entries, entry0 and entry1, then entry0 is at offset 0 and entry1 is at offset 1.

Pos refers to the actual location of an entry in the file, measured in bytes. E.g., if we have two entries, entry0 and entry1, each 8 bytes long, then entry0 is at pos=0 and entry1 is at pos=8.
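
The relationship between offset and pos in the example above can be sketched in a few lines. This is only an illustration under the assumption that every entry has the same fixed width of 8 bytes; `entryWidth` and `posOf` are hypothetical names, not identifiers from the project.

```go
package main

import "fmt"

// entryWidth is the assumed fixed size of every entry, in bytes.
const entryWidth = 8

// posOf returns the byte position of the entry at the given offset,
// assuming all entries have the same width.
func posOf(offset uint64) uint64 {
	return offset * entryWidth
}

func main() {
	for off := uint64(0); off < 2; off++ {
		fmt.Printf("entry%d: offset=%d pos=%d\n", off, off, posOf(off))
	}
}
```

When entries vary in length, the position can no longer be computed from the offset alone, which is why a lookup structure that stores offset and position pairs becomes useful.
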
- Consists of 4 components: Store and Index at the core, Segment wrapped around them, and Log as the outermost component.
- Store is the most important component: it holds a pointer to the file where the actual data (records) is saved.
- Index speeds up read operations: it holds pairs of offset and position so we can jump straight to a record's location instead of iterating through the file. Store and Index are two files that come as a pair, so if one is full, both get replaced.
- Segment is an abstraction around Store and Index; each Segment has exactly one Store and Index pair. Whenever we need to interact (create/append/read/delete/etc.) with Store and Index, we go through the Segment so that we deal with one entity instead of two.
- Log manages a list of Segments, consisting of the old segments and one active segment that data is currently being written to. When the Store or Index of the active segment is full (reaches the pre-configured size), the Log creates a new Segment (with a new Store and Index) and makes it the active segment, while the previous one is pushed into the list of old segments. Each Segment holds a variable baseOffset, which the Log uses to determine which Segment to read from.
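
How the Log rolls over to a new active segment and uses baseOffset to pick a segment for reads can be sketched roughly as follows. This is a simplified in-memory model, not the project's code: the `segment` type keeps records in a slice instead of Store and Index files, the size limit `maxRecords` is counted in records rather than bytes, and all names are assumptions made for illustration.

```go
package main

import "fmt"

// segment is an in-memory stand-in for one Store/Index pair.
type segment struct {
	baseOffset uint64   // offset of the first record in this segment
	records    [][]byte // stands in for the store file
}

// nextOffset is the offset the next appended record would receive.
func (s *segment) nextOffset() uint64 {
	return s.baseOffset + uint64(len(s.records))
}

// Log keeps all segments in order; the last one is the active segment.
type Log struct {
	maxRecords int // hypothetical size limit, counted in records
	segments   []*segment
}

func NewLog(maxRecords int) *Log {
	return &Log{maxRecords: maxRecords, segments: []*segment{{baseOffset: 0}}}
}

// Append writes to the active segment, creating a new one first if the
// current active segment is full. It returns the record's offset.
func (l *Log) Append(record []byte) uint64 {
	active := l.segments[len(l.segments)-1]
	if len(active.records) >= l.maxRecords {
		active = &segment{baseOffset: active.nextOffset()}
		l.segments = append(l.segments, active)
	}
	off := active.nextOffset()
	active.records = append(active.records, record)
	return off
}

// Read finds the segment whose offset range contains off.
func (l *Log) Read(off uint64) ([]byte, error) {
	for _, s := range l.segments {
		if s.baseOffset <= off && off < s.nextOffset() {
			return s.records[off-s.baseOffset], nil
		}
	}
	return nil, fmt.Errorf("offset out of range: %d", off)
}

func main() {
	l := NewLog(2)
	for _, r := range []string{"a", "b", "c"} {
		l.Append([]byte(r))
	}
	rec, err := l.Read(2)
	fmt.Println(string(rec), err, len(l.segments))
}
```

With a limit of two records per segment, the third append lands in a second segment whose baseOffset is 2, and `Read(2)` is served from that segment.
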
- A component that handles server-to-server discovery using Serf.
- Every instance is managed by a Serf cluster, and whenever an instance joins or leaves the cluster, Serf ensures that every other instance in the cluster knows about it.
- Each Serf instance uses startJoinAddrs to determine which cluster to join.
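- Every instance in the cluster keeps track of the current state of the cluster (number of working instances, which instances just joined or left, etc.). Serf spreads this information by gossip, so each instance's view is eventually consistent rather than instantaneous.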
- The flow below is the one used in the unit tests.
- We use Raft to create a leader/follower relationship between multiple servers.
- The leader is the one actively listening for and executing requests.
- The followers just replicate/store the requests received from the leader without doing any actual work.
- When the leader is lost, the followers start a leader election to vote in a new leader.
- We create a DistributedLog component that wraps our CommitLog and Raft.
- Exports an Agent that manages the different components and processes that make up the service (membership, log/distributed log, and server).
- We can use the Agent to boot up the whole service instead of having to configure each component.
- Each Agent contains a multiplexer to distinguish between Raft requests and gRPC requests.
- Each Agent contains exactly one combination of Serf membership, distributed log/Raft, and server. Multiple Agents form a cluster, which is where Serf and Raft take effect.
- Understand gommap
- Understand enc ?
- Read more on Casbin
- Read more on Serf
- Read more on Raft
- Read more on cmux (what is the difference between Raft and gRPC requests, and when does each get called?)
- Read more on gRPC's Resolver and Picker