
Weighted Belief updates #49

Draft
wants to merge 24 commits into base: main


Conversation

weihangzheng
Collaborator

Weighted Belief Changes

Modifications Implemented:

  • Each agent is now equipped with:

    • A list of truthful weights: a length-N vector of scalar values in [0, 1], where entry j is the probability that this agent considers agent j to be truthful.
  • Recent Communications:

    • A dictionary holding the most recent (2xN) messages exchanged between agents.
    • The structure is designed to accommodate data from previous iterations, so that a sequential model (RNN, LSTM, or Transformer) can consume the history.
  • Adaptive Model:

    • Processes a (2xN) collection of communicated messages (Recent Communications) from one agent to another and predicts a probability between 0 and 1, which is then used to revise the truthful weights list.
    • A single model could be shared by all agents, since each agent presents a similar noise level or belief vector to every other agent; the current approach keeps a separate model per agent to allow for future complexity enhancements.
    • The architecture can be expanded to process a (2xNxk) dataset, where k is the number of previous iterations to consider. Alternatively, a state-based memory model such as an RNN, LSTM, or Transformer could manage the temporal data more effectively.
  • Belief Update Mechanism:

    • The previous method of stripping extreme values (strip D) has been replaced.
    • Each incoming message about a specific agent's position is now weighted by the truthful weights list, followed by a normalization step (weighted sum, then normalize).
    • This makes the update linear in the number of agents (with the model's evaluation time as a constant factor), compared with the O(n log(n)) previously required to sort and strip D values.
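As a minimal sketch of the structures and update described above (the names `Agent`, `update_belief`, and the fallback behavior for an all-zero weight vector are illustrative assumptions, not this PR's actual API):

```python
from dataclasses import dataclass, field

@dataclass
class Agent:
    # Truthful weights: entry j is this agent's probability, in [0, 1],
    # that agent j is truthful.
    truthful_weights: list
    # Most recent (2 x N) messages exchanged with each other agent,
    # keyed by agent id; the history can later feed a sequential model.
    recent_comms: dict = field(default_factory=dict)

def update_belief(reports, weights):
    """Fuse incoming position reports about one agent.

    Weighted sum of the reports, then normalize by the total weight.
    This is O(n) in the number of agents, versus the O(n log n)
    sort-and-strip-D method it replaces.
    """
    total = sum(weights)
    if total == 0:
        # No reporter is trusted; fall back to a plain mean (an assumption).
        return sum(reports) / len(reports)
    return sum(w * r for w, r in zip(weights, reports)) / total
```

For example, `update_belief([1.0, 3.0], [1.0, 1.0])` averages the two reports, while giving a reporter zero weight removes its influence entirely.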

Notes on Model Development

  • Intuition Behind the Model:
    • Agents with adversarial intentions tend to send more distorted data, which the model aims to identify and filter out.
    • The model could be pre-trained on high-quality synthetic data or on data gathered from actual gameplay, although this may conflict with the goal of minimizing the information provided to the agents.
    • An alternative is to evaluate and train the model iteratively on ongoing gameplay data.
      • This requires retrospective, fairly accurate ground truth about whether each agent is truthful or adversarial.
    • To conserve computational resources, the model could be trained and assessed periodically rather than after every iteration.
      • The primary focus should be on exploring and refining the model.
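One way to sketch the periodic train-and-evaluate idea above (the helper names `should_train` and `training_examples`, and the period of 10 iterations, are assumptions for illustration):

```python
TRAIN_PERIOD = 10  # train every 10th iteration to conserve compute (assumed value)

def should_train(iteration, period=TRAIN_PERIOD):
    """Return True on iterations where the model should be trained and evaluated."""
    return iteration > 0 and iteration % period == 0

def training_examples(recent_comms, ground_truth):
    """Pair each agent's (2 x N) recent messages with its retrospective
    label (1 = truthful, 0 = adversarial) to form supervised examples."""
    return [(recent_comms[agent], ground_truth[agent]) for agent in recent_comms]
```

The retrospective labels would come from whatever ground truth the game can reconstruct after the fact, which is the prerequisite noted above.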
