Introduction

When building an open-source intelligence (OSINT) dataset that focuses on events and their links to people and organizations, the data needs to be comprehensive, consistently structured, and easy to interpret. Here’s a breakdown of the key metadata and event data types you should consider recording:

Core Event Data:

  1. Event ID:

    • A unique identifier for each event.
  2. Event Type:

    • A classification of the event (e.g., meeting, arrest, transaction, attack).
  3. Event Name/Title:

    • A descriptive name or title for the event.
  4. Date and Time:

    • Start and end dates/times of the event, ideally in a single consistent format such as ISO 8601 (e.g., 2024-05-17T08:30:00Z). Include time zone information if available.
  5. Location:

    • Detailed location information including address, city, state, country, and geocoordinates (latitude and longitude).
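
The core fields above can be sketched as a small typed record. This is only an illustration under stated assumptions: the class names CoreEvent and Location are hypothetical, and the checks (time zone aware timestamps, coordinates within valid ranges) are stricter than the "if available" wording above, so relax them if your sources often lack that detail.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Location:
    """Hypothetical container for the location fields listed above."""
    address: Optional[str] = None
    city: Optional[str] = None
    state: Optional[str] = None
    country: Optional[str] = None
    latitude: Optional[float] = None
    longitude: Optional[float] = None

    def __post_init__(self) -> None:
        # Reject coordinates outside the valid latitude/longitude ranges.
        if self.latitude is not None and not -90 <= self.latitude <= 90:
            raise ValueError(f"latitude out of range: {self.latitude}")
        if self.longitude is not None and not -180 <= self.longitude <= 180:
            raise ValueError(f"longitude out of range: {self.longitude}")


@dataclass
class CoreEvent:
    """Hypothetical record holding the five core event fields."""
    event_id: str
    event_type: str
    event_name: str
    start: datetime
    end: Optional[datetime] = None
    location: Optional[Location] = None

    def __post_init__(self) -> None:
        # Assumption: timestamps must carry time zone information.
        for label, value in (("start", self.start), ("end", self.end)):
            if value is not None and value.tzinfo is None:
                raise ValueError(f"{label} must be time zone aware")
        if self.end is not None and self.end < self.start:
            raise ValueError("end must not precede start")


# Usage: build one event with UTC timestamps and geocoordinates.
event = CoreEvent(
    event_id="E12345",
    event_type="arrest",
    event_name="High-Profile Arrest",
    start=datetime(2024, 5, 17, 8, 30, tzinfo=timezone.utc),
    end=datetime(2024, 5, 17, 10, 0, tzinfo=timezone.utc),
    location=Location(city="Calgary", country="Canada",
                      latitude=51.0447, longitude=-114.0719),
)
```

Validating at construction time means a malformed record fails where it is created instead of surfacing later during analysis.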

Metadata for Context and Details:

  1. Description:

    • A detailed description of the event, including any relevant context and background information.
  2. Sources:

    • References or links to the sources of information about the event (e.g., news articles, official reports, databases).
  3. Participants:

    • List of people involved in the event. Each participant can be linked to a unique ID in the people dataset.
  4. Organizations:

    • List of organizations involved in the event. Each organization can be linked to a unique ID in the organizations dataset.
  5. Event Outcome:

    • The result or consequence of the event (e.g., successful, failed, ongoing).
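
Because sources may be web links or other kinds of references (reports, database records), it can help to separate the two and drop duplicates before storing them. The following is a minimal sketch; the function name split_sources is hypothetical, and treating only http/https URLs as "links" is an assumption.

```python
from urllib.parse import urlparse


def split_sources(sources: list[str]) -> tuple[list[str], list[str]]:
    """Split source references into web URLs and other references.

    Duplicates and empty strings are dropped; original order is preserved.
    """
    seen: set[str] = set()
    urls: list[str] = []
    other: list[str] = []
    for raw in sources:
        ref = raw.strip()
        if not ref or ref in seen:
            continue
        seen.add(ref)
        parts = urlparse(ref)
        # Assumption: only absolute http(s) URLs count as web links.
        if parts.scheme in ("http", "https") and parts.netloc:
            urls.append(ref)
        else:
            other.append(ref)
    return urls, other


urls, other = split_sources([
    "https://news.example.com/article/12345",
    " https://news.example.com/article/12345",  # duplicate with stray space
    "Official report, 2024",
])
```

Non-URL references are kept instead of discarded, since an official report or database entry may be the only source for an event.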

Additional Data for Depth and Analysis:

  1. Event Category:

    • Higher-level categorization of the event (e.g., political, criminal, economic).
  2. Related Events:

    • Links to other events that are related or connected (e.g., follow-up events, precursor events).
  3. Impact:

    • The impact or significance of the event, possibly quantified or categorized (e.g., high, medium, low impact).
  4. Keywords/Tags:

    • Keywords or tags that help in categorizing and searching for events.
  5. Media:

    • Links or attachments to relevant media (e.g., photos, videos, documents).
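
As a rough illustration of how keywords and related-event links support search and analysis, the sketch below builds a keyword index and follows related_events links between entries. The function names are hypothetical, events are assumed to be plain dictionaries, and treating links as undirected (so precursors and follow-ups are both found) is a design assumption.

```python
from collections import defaultdict, deque


def build_keyword_index(events: list[dict]) -> dict[str, set[str]]:
    """Map each lower-cased keyword to the IDs of events tagged with it."""
    index: dict[str, set[str]] = defaultdict(set)
    for event in events:
        for keyword in event.get("keywords", []):
            index[keyword.lower()].add(event["event_id"])
    return dict(index)


def connected_events(events: list[dict], start_id: str) -> set[str]:
    """Return IDs of all events reachable from start_id via related_events."""
    neighbours: dict[str, set[str]] = defaultdict(set)
    for event in events:
        for other in event.get("related_events", []):
            # Record the link in both directions.
            neighbours[event["event_id"]].add(other)
            neighbours[other].add(event["event_id"])
    seen = {start_id}
    queue = deque([start_id])
    while queue:  # breadth-first traversal of the link graph
        current = queue.popleft()
        for nxt in neighbours[current] - seen:
            seen.add(nxt)
            queue.append(nxt)
    seen.discard(start_id)
    return seen


events = [
    {"event_id": "E12345", "keywords": ["arrest", "Corruption"],
     "related_events": ["E54321"]},
    {"event_id": "E54321", "keywords": ["corruption"], "related_events": []},
    {"event_id": "E99999", "keywords": ["meeting"],
     "related_events": ["E54321"]},
]
index = build_keyword_index(events)
chain = connected_events(events, "E12345")
```

Lower-casing keywords at index time avoids near-duplicate tags such as "Corruption" and "corruption" splitting search results.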

Examples and Implementation:

Example of an Event Entry:

{
  "event_id": "E12345",
  "event_type": "arrest",
  "event_name": "High-Profile Arrest",
  "date_time": {
    "start": "2024-05-17T08:30:00Z",
    "end": "2024-05-17T10:00:00Z"
  },
  "location": {
    "address": "123 Main St",
    "city": "Calgary",
    "state": "AB",
    "country": "Canada",
    "geocoordinates": {
      "latitude": 51.0447,
      "longitude": -114.0719
    }
  },
  "description": "The arrest of a high-profile individual on charges of corruption.",
  "sources": [
    "https://news.example.com/article/12345",
    "https://officialreport.example.com/doc/67890"
  ],
  "participants": ["P123", "P456"],
  "organizations": ["O789", "O101"],
  "event_outcome": "successful",
  "event_category": "criminal",
  "related_events": ["E54321"],
  "impact": "high",
  "keywords": ["arrest", "corruption", "high-profile"],
  "media": [
    "https://media.example.com/photo/123",
    "https://media.example.com/video/456"
  ]
}
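
An entry like the one above can be checked before it is added to the dataset. The sketch below is one possible approach, not a fixed rule set: the names REQUIRED_FIELDS, LIST_FIELDS, parse_timestamp, and validate_event are hypothetical, and treating the five core fields as mandatory is an assumption. The sample entry is abbreviated.

```python
import json
from datetime import datetime

# Assumption: the five core fields are treated as mandatory.
REQUIRED_FIELDS = ("event_id", "event_type", "event_name", "date_time", "location")
# Fields that hold lists of IDs, links, or tags in the example entry.
LIST_FIELDS = ("sources", "participants", "organizations",
               "related_events", "keywords", "media")


def parse_timestamp(value: str) -> datetime:
    # Older Python versions reject a trailing "Z" in fromisoformat,
    # so map it to the equivalent "+00:00" offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def validate_event(entry: dict) -> list[str]:
    """Return a list of problems; an empty list means the entry passed."""
    problems = [f"missing field: {name}"
                for name in REQUIRED_FIELDS if name not in entry]
    date_time = entry.get("date_time")
    if isinstance(date_time, dict):
        try:
            start = parse_timestamp(date_time["start"])
            end_raw = date_time.get("end")
            if end_raw is not None and parse_timestamp(end_raw) < start:
                problems.append("date_time.end precedes date_time.start")
        except (KeyError, TypeError, ValueError, AttributeError) as exc:
            problems.append(f"invalid date_time: {exc!r}")
    elif date_time is not None:
        problems.append("date_time must be an object")
    for name in LIST_FIELDS:
        value = entry.get(name, [])
        if not isinstance(value, list) or not all(
                isinstance(item, str) for item in value):
            problems.append(f"{name} must be a list of strings")
    return problems


sample = json.loads("""
{
  "event_id": "E12345",
  "event_type": "arrest",
  "event_name": "High-Profile Arrest",
  "date_time": {"start": "2024-05-17T08:30:00Z", "end": "2024-05-17T10:00:00Z"},
  "location": {"city": "Calgary", "country": "Canada"},
  "participants": ["P123", "P456"]
}
""")
problems = validate_event(sample)
```

Returning a list of problems, instead of raising on the first one, lets a reviewer see everything wrong with an entry in a single pass.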

Linking with People and Organizations:

  • People Dataset:

    • person_id, name, aliases, date_of_birth, nationality, roles, bio, etc.
  • Organizations Dataset:

    • organization_id, name, aliases, type, location, description, etc.
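
To keep these links trustworthy, every ID referenced by an event should exist in the people or organizations dataset. A minimal sketch of that check follows, along with a reverse lookup from person to events. It assumes all three datasets are lists of dictionaries keyed as described above; the function names are hypothetical.

```python
def find_dangling_links(events: list[dict], people: list[dict],
                        organizations: list[dict]) -> dict[str, list[str]]:
    """Map each event ID to the participant/organization IDs it references
    that are missing from the people and organizations datasets."""
    known_people = {p["person_id"] for p in people}
    known_orgs = {o["organization_id"] for o in organizations}
    dangling: dict[str, list[str]] = {}
    for event in events:
        missing = [pid for pid in event.get("participants", [])
                   if pid not in known_people]
        missing += [oid for oid in event.get("organizations", [])
                    if oid not in known_orgs]
        if missing:
            dangling[event["event_id"]] = missing
    return dangling


def events_by_person(events: list[dict]) -> dict[str, list[str]]:
    """Reverse lookup: person ID -> IDs of events that person took part in."""
    lookup: dict[str, list[str]] = {}
    for event in events:
        for pid in event.get("participants", []):
            lookup.setdefault(pid, []).append(event["event_id"])
    return lookup


# Illustrative records; names and values are placeholders.
people = [{"person_id": "P123", "name": "Example Person"}]
organizations = [{"organization_id": "O789", "name": "Example Org"}]
events = [{"event_id": "E12345",
           "participants": ["P123", "P456"],
           "organizations": ["O789", "O101"]}]

dangling = find_dangling_links(events, people, organizations)
by_person = events_by_person(events)
```

Here P456 and O101 would be reported as dangling because no matching records exist in the sample people and organizations lists.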

Conclusion

By structuring your event data with comprehensive metadata and clear links to related entities (people and organizations), you make the dataset easier to analyze and make the relationships between events, and their impacts, easier to trace. This approach helps keep your open-source intelligence dataset robust, transparent, and valuable for public information purposes.