GitHub Connector for eDiscovery & Data Collection
Collect GitHub issues, pull requests, and repository metadata for eDiscovery, compliance, and investigations without waiting on IT. The Onna + GitHub Connector lets you access developer activity data across your organization on a defensible, unified platform.
Why Connect GitHub to Your eDiscovery Collections Platform
GitHub is where development teams store code, track work, and collaborate on projects. When legal or compliance needs arise, the activity data living in repositories — issues filed, pull requests opened, labels applied — has to be accessible, complete, and defensible.
Without the right tools, collecting from GitHub is harder than it should be because:
Activity is spread across multiple repositories and projects
Issues and pull requests may involve many contributors
Individual user credentials are required for collection
Two-factor authentication adds complexity to the collection workflow
The Onna + GitHub Connector enables organizations to collect this data in a defensible and scalable way. Through GitHub's API, the connector extracts issues and pull requests across repositories while preserving metadata and activity history.
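To make "through GitHub's API" concrete, here is a minimal sketch of how issues and pull requests can be listed with GitHub's public REST API. It is not Onna's implementation. The owner and repository names are placeholders, and the helper names are invented for illustration. The one GitHub behavior it relies on is that the repository issues endpoint returns pull requests as well, marked by a `pull_request` key.

```python
from urllib.parse import urlencode

GITHUB_API = "https://api.github.com"


def issues_url(owner, repo, page=1, per_page=100):
    """Build the REST URL that lists open and closed issues for one repository."""
    query = urlencode({"state": "all", "per_page": per_page, "page": page})
    return f"{GITHUB_API}/repos/{owner}/{repo}/issues?{query}"


def split_issues_and_pulls(items):
    """Separate plain issues from pull requests in an issues-endpoint response.

    GitHub returns pull requests from the issues endpoint too; those entries
    carry a "pull_request" key, which plain issues lack.
    """
    issues, pulls = [], []
    for item in items:
        (pulls if "pull_request" in item else issues).append(item)
    return issues, pulls


# Hypothetical response data, shaped like GitHub's JSON but heavily trimmed.
sample = [
    {"number": 1, "title": "Crash on login", "labels": [{"name": "bug"}]},
    {
        "number": 2,
        "title": "Fix login crash",
        "pull_request": {
            "url": "https://api.github.com/repos/example-org/example-repo/pulls/2"
        },
    },
]

issues, pulls = split_issues_and_pulls(sample)
print(issues_url("example-org", "example-repo"))
print([i["number"] for i in issues], [p["number"] for p in pulls])
```

A real collection would also send an authenticated request and follow pagination until no further pages remain; both are left out here to keep the sketch short.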
GitHub Connector Capabilities
The Onna + GitHub Connector is designed for enterprise-scale collection of developer activity data.
Key capabilities include:
Direct connection to GitHub via API
Full archive support
Support for one-time and auto-sync modes
Repository-level collection controls
Audit logs for all collection activity
Two-factor authentication (2FA) support
Metadata preservation alongside collected content
These capabilities allow organizations to perform targeted collections from specific repositories or maintain ongoing archives of GitHub activity.
What Data Can Be Collected from GitHub
The connector captures issues, pull requests, and associated metadata from GitHub repositories, including:
Developer Activity
All issues
All pull requests
Each issue and pull request is collected with its metadata so investigators can trace where an item came from.
GitHub Metadata Collected
Alongside issue and pull request content, the connector captures key metadata fields, including:
Metadata
Path to original file
Repository name
Labels
List of creators
File extension
File size
File last modified date
These collections preserve the structure and context of repository activity so investigators can reconstruct development timelines and identify key contributors accurately.
Note: Onna collects issues and pull requests only. The actual code files within a repository are not synced.
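As an illustration of how several of the fields above relate to what GitHub returns, the sketch below maps one issue from GitHub's REST API onto a simple record. The `CollectedItem` class and the field mapping are assumptions made for this example, not Onna's schema. In particular, using `html_url` as the path and the issue author as the only creator are guesses.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class CollectedItem:
    """Hypothetical record holding a subset of the metadata fields listed above."""
    path: str
    repository: str
    labels: list
    creators: list
    last_modified: datetime


def to_record(repository, item):
    """Map one GitHub issue or pull request (REST JSON) to a CollectedItem."""
    return CollectedItem(
        path=item["html_url"],  # assumed stand-in for "path to original file"
        repository=repository,
        labels=[label["name"] for label in item.get("labels", [])],
        creators=[item["user"]["login"]],  # assumed: author only
        # GitHub timestamps are ISO 8601 in UTC with a trailing "Z".
        last_modified=datetime.fromisoformat(
            item["updated_at"].replace("Z", "+00:00")
        ),
    )


# Hypothetical issue, trimmed to the keys used above.
issue = {
    "html_url": "https://github.com/example-org/example-repo/issues/1",
    "user": {"login": "octocat"},
    "labels": [{"name": "bug"}, {"name": "priority:high"}],
    "updated_at": "2024-01-15T09:30:00Z",
}

record = to_record("example-org/example-repo", issue)
print(record.repository, record.labels, record.creators, record.last_modified)
```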
How GitHub Data Collection Works
The connector simplifies GitHub data collection through a structured workflow.
Add GitHub as a data source
Navigate to your workspace and add GitHub as a source.
Authenticate the connection
Complete the OAuth flow by signing in with the GitHub username and password of the user being collected.
Configure the collection
Once authenticated, define your collection settings, including:
- Collection name
- Sync mode (one-time or auto-sync)
Select repositories
Click "Get repositories" to load available repositories, then select which ones to include in the collection.
Start sync
Once configuration is complete, the GitHub collection begins and data appears within your Onna workspace.
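The settings gathered in the steps above can be pictured as a small configuration object. This is only a sketch for readers who think in code. The class, enum, and field names are hypothetical and do not correspond to an Onna API.

```python
from dataclasses import dataclass, field
from enum import Enum


class SyncMode(Enum):
    """The two sync modes described on this page."""
    ONE_TIME = "one-time"
    AUTO_SYNC = "auto-sync"


@dataclass
class CollectionConfig:
    """Hypothetical container for the settings chosen during setup."""
    name: str
    sync_mode: SyncMode
    repositories: list = field(default_factory=list)

    def validate(self):
        """Return a list of problems; an empty list means the sync can start."""
        problems = []
        if not self.name.strip():
            problems.append("collection name is required")
        if not self.repositories:
            problems.append("select at least one repository")
        return problems


config = CollectionConfig(
    name="Example matter",
    sync_mode=SyncMode.ONE_TIME,
    repositories=["example-org/example-repo"],
)
print(config.validate())
```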
GitHub Data Collection Options
The Onna + GitHub Connector supports flexible sync modes depending on investigation needs.
One-Time Sync
A targeted collection used for litigation or investigations with a defined scope.
Auto-Sync
Automatically collects new issues and pull requests as they are created across connected repositories.
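One common way to implement this kind of ongoing collection against GitHub's REST API is to remember when the last sync ran and ask only for items changed since then. The sketch below assumes that approach; this page does not describe how Onna schedules or scopes auto-sync. Note that GitHub's `since` parameter filters on last update time, so it returns edited items as well as newly created ones. The function names and checkpoint logic are illustrative.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def incremental_issues_url(owner, repo, last_sync):
    """Build a URL listing issues and pull requests updated after last_sync."""
    since = last_sync.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    query = urlencode({"state": "all", "since": since})
    return f"https://api.github.com/repos/{owner}/{repo}/issues?{query}"


def next_checkpoint(items, previous):
    """Return the newest updated_at among items, or previous if there are none."""
    seen = [
        datetime.fromisoformat(item["updated_at"].replace("Z", "+00:00"))
        for item in items
    ]
    return max(seen, default=previous)


last_sync = datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc)
print(incremental_issues_url("example-org", "example-repo", last_sync))

# Hypothetical items returned by the request above.
fetched = [
    {"number": 7, "updated_at": "2024-01-16T08:00:00Z"},
    {"number": 8, "updated_at": "2024-01-17T12:45:00Z"},
]
print(next_checkpoint(fetched, last_sync).isoformat())
```

Storing the checkpoint from the data returned, rather than from the local clock, avoids missing items when the collector's clock and GitHub's differ.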
Common GitHub eDiscovery Use Cases
Litigation Response
Collect GitHub issues and pull requests relevant to legal matters quickly and defensibly.
Regulatory Compliance
Archive GitHub activity records to meet regulatory data retention requirements.
Internal Investigations
Identify repository activity, contributors, and labeled issues related to incidents or policy violations.
Onna + GitHub Connector FAQs
Does Onna collect the source code in a repository?
No. Onna collects issues and pull requests only, along with their associated metadata. The source code files within a repository are not synced.
Can I collect from multiple repositories in one collection?
Yes. During setup, you can include multiple repositories in a single collection by checking the names of those you'd like to sync.
What credentials are needed to collect from GitHub?
You will need the individual login credentials (username and password) for each GitHub user being collected. Admin-level access is not required.
Does the connector support two-factor authentication (2FA)?
Yes. If your organization has enabled 2FA, Onna automatically includes the 2FA step in the synchronization workflow.
Does Onna keep an audit trail of collection activity?
Yes. Onna maintains a comprehensive audit log of all preservations, collections, and user actions. Every collection has a documented chain of custody.
Start Collecting GitHub Data for eDiscovery
Connect GitHub in minutes and begin collecting issues, pull requests, and repository activity data from across your organization.
