Introduction

When building an open-source intelligence dataset that focuses on people, their organizational memberships, and significant events related to them, it’s essential to ensure that the data is comprehensive, structured, and easily interpretable. Here’s a detailed breakdown of the key metadata and person data types you should consider recording:

Core Person Data:

  1. Person ID:

    • A unique identifier for each person.
  2. Full Name:

    • The person’s full legal name.
  3. Aliases/Nicknames:

    • Any known aliases or nicknames the person might use.
  4. Date of Birth:

    • The person’s date of birth.
  5. Nationality:

    • The person’s nationality or nationalities.

Metadata for Context and Details:

  1. Gender:

    • The person’s gender.
  2. Occupation/Role:

    • The person’s occupation or roles they have held over time.
  3. Biography:

    • A brief biography detailing important aspects of the person’s life and career.
  4. Photograph:

    • A link to or an embedded photograph of the person.
  5. Contact Information:

    • Any known contact details (e.g., email, phone number). This information may be sensitive and should be restricted to internal reference.

Additional Data for Connections and Activities:

  1. Addresses:

    • Known addresses (current and previous) including city, state, country, and specific locations if available.
  2. Affiliations:

    • A list of organizations the person is or has been affiliated with, linked to organization IDs.
  3. Related Events:

    • A list of events the person has been involved in, linked to event IDs.
  4. Criminal Records:

    • Details of any known criminal records, including charges, convictions, and sentences.
  5. Social Media Profiles:

    • Links to any known social media profiles.
  6. Education:

    • Educational background, including institutions attended and degrees earned.
  7. Known Associates:

    • Other people known to be associated with this person, linked to their person IDs.
  8. Keywords/Tags:

    • Keywords or tags that help categorize the person’s activities or roles.
  9. Media:

    • Links to relevant media (e.g., interviews, public statements, articles).
  10. Notes:

    • Any additional notes or comments about the person that might be relevant for the dataset.
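
If you plan to process these records in code, it can help to pin the fields down as a typed record. The following is a minimal sketch in Python, assuming a dataclass named Person (a hypothetical name) that covers the core fields and the ID-based links; the remaining fields from the lists above would follow the same pattern. The choice of optional versus required fields here is an assumption, not a requirement of the dataset.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Person:
    """Illustrative person record; only a subset of the listed fields is shown."""

    # Core person data: only the ID and name are treated as required here.
    person_id: str
    full_name: str
    aliases: List[str] = field(default_factory=list)
    date_of_birth: Optional[date] = None  # unknown dates stay None
    nationality: List[str] = field(default_factory=list)

    # Connections are stored as IDs that point into other datasets.
    affiliations: List[str] = field(default_factory=list)      # organization IDs
    related_events: List[str] = field(default_factory=list)    # event IDs
    known_associates: List[str] = field(default_factory=list)  # person IDs

    keywords: List[str] = field(default_factory=list)
    notes: str = ""


person = Person(
    person_id="P12345",
    full_name="John Doe",
    aliases=["JD", "Johnny"],
    date_of_birth=date(1975, 6, 15),
)
other = Person(person_id="P678", full_name="Jane Roe")
```

Using default_factory gives each record its own empty list, so adding an affiliation to one person never leaks into another.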

Example of a Person Entry:

{
  "person_id": "P12345",
  "full_name": "John Doe",
  "aliases": ["JD", "Johnny"],
  "date_of_birth": "1975-06-15",
  "nationality": ["American"],
  "gender": "Male",
  "occupation": "Businessman",
  "biography": "John Doe is a prominent businessman known for his involvement in various high-profile companies. He has been linked to multiple cases of financial misconduct.",
  "photograph": "https://example.com/photos/johndoe.jpg",
  "contact_information": {
    "email": "johndoe@example.com",
    "phone_number": "+1234567890"
  },
  "addresses": [
    {
      "address": "123 Main St",
      "city": "New York",
      "state": "NY",
      "country": "USA"
    },
    {
      "address": "456 Another St",
      "city": "Los Angeles",
      "state": "CA",
      "country": "USA"
    }
  ],
  "affiliations": ["O123", "O456"],
  "related_events": ["E123", "E456"],
  "criminal_records": [
    {
      "charge": "Fraud",
      "conviction_date": "2010-05-20",
      "sentence": "5 years"
    }
  ],
  "social_media_profiles": {
    "twitter": "https://twitter.com/johndoe",
    "linkedin": "https://linkedin.com/in/johndoe"
  },
  "education": [
    {
      "institution": "Harvard University",
      "degree": "MBA",
      "graduation_year": 2000
    }
  ],
  "known_associates": ["P678", "P910"],
  "keywords": ["business", "fraud", "finance"],
  "media": [
    "https://media.example.com/interview/johndoe",
    "https://news.example.com/article/98765"
  ],
  "notes": "Has been under investigation multiple times but only convicted once."
}
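
Before an entry like this is added to the dataset, a few basic checks can catch malformed records. The sketch below is one possible approach in Python, not a definitive validator. It assumes that person_id and full_name are required, that dates use the ISO format shown in the example, and that every ID is a single capital letter followed by digits (a pattern inferred from the example values, not a stated rule). The names validate_person and ID_PATTERN are hypothetical.

```python
import json
import re
from datetime import date

# Assumed ID format, inferred from values such as "P12345", "O123", and "E456".
ID_PATTERN = re.compile(r"^[A-Z]\d+$")


def validate_person(entry):
    """Return a list of problems found in the entry; an empty list means it passed."""
    problems = []

    # Assumption: these two fields are the minimum needed to identify a person.
    for key in ("person_id", "full_name"):
        if not entry.get(key):
            problems.append(f"missing required field: {key}")

    # Dates are expected as YYYY-MM-DD strings, as in the example entry.
    dob = entry.get("date_of_birth")
    if dob is not None:
        try:
            date.fromisoformat(dob)
        except (TypeError, ValueError):
            problems.append(f"date_of_birth is not an ISO date: {dob!r}")

    # Every link to another record should look like a well-formed ID.
    for key in ("affiliations", "related_events", "known_associates"):
        for ref in entry.get(key, []):
            if not isinstance(ref, str) or not ID_PATTERN.match(ref):
                problems.append(f"{key} contains a malformed ID: {ref!r}")

    return problems


raw = """
{
  "person_id": "P12345",
  "full_name": "John Doe",
  "date_of_birth": "1975-06-15",
  "affiliations": ["O123", "O456"],
  "related_events": ["E123", "E456"],
  "known_associates": ["P678", "P910"]
}
"""
good_entry = json.loads(raw)
bad_entry = {"full_name": "Jane Roe", "date_of_birth": "15/06/1975", "affiliations": ["org-1"]}

good_problems = validate_person(good_entry)
bad_problems = validate_person(bad_entry)
```

Returning a list of problems rather than raising on the first one lets you report everything wrong with a record in a single pass.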

Linking with Other Data Types:

  • Events Dataset:

    • Link each event with the people involved using their unique person IDs.
  • Organizations Dataset:

    • Link each organization with the people affiliated with it using their unique person IDs.
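
If person entries already carry lists of organization and event IDs, the reverse links can be derived rather than maintained by hand. Here is a minimal sketch in Python, assuming person entries are plain dictionaries shaped like the example entry; the helper name build_reverse_index is hypothetical, and the sample records are illustrative only.

```python
from collections import defaultdict


def build_reverse_index(people, field_name):
    """Map each referenced ID (organization or event) to the person IDs that cite it."""
    index = defaultdict(list)
    for person in people:
        # A missing field is treated as "no links" rather than as an error.
        for ref_id in person.get(field_name, []):
            index[ref_id].append(person["person_id"])
    return dict(index)


# Illustrative records; only the fields needed for linking are included.
people = [
    {"person_id": "P12345", "affiliations": ["O123", "O456"], "related_events": ["E123"]},
    {"person_id": "P678", "affiliations": ["O123"], "related_events": ["E123", "E456"]},
]

org_members = build_reverse_index(people, "affiliations")
event_participants = build_reverse_index(people, "related_events")
```

Deriving both indexes from the person records keeps a single source of truth, so an organization's member list cannot drift out of sync with the affiliations recorded on each person.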

Conclusion

By structuring the People dataset with detailed and comprehensive metadata, you create a robust framework for tracking and analyzing the connections among individuals, events, and organizations. This detailed approach enhances the clarity and usability of the dataset for public information and investigative purposes.