Automated OSINT with Python – Tools, Tactics, and Full Scripts
by darkhorse10123 - Friday January 24, 2025 at 05:44 AM
In today’s information-driven world, manually sifting through data sources can be both time-consuming and error-prone. Python is a powerful language that, combined with public APIs and lightweight scraping techniques, can help automate large parts of the OSINT (Open-Source Intelligence) workflow.
This thread provides:
  1. [b]An Overview of Automated OSINT Use Cases[/b]
  2. [b]A Step-by-Step OSINT Python Script[/b]
  3. [b]Tips for Legitimate and Ethical Usage[/b]
[b]Disclaimer: This tutorial is for educational and legal intelligence-gathering purposes only. Always ensure you comply with all relevant local laws and regulations.[/b]

[b]1. Automated OSINT Use Cases[/b]
  1. [b]Social Media Monitoring[/b]
    • Twitter, Reddit, Telegram, etc.
    • Gathering targeted search results or mentions.
  2. [b]Data Breach & Credential Checking[/b]
    • Publicly available breach databases or sites offering advanced search (e.g., haveibeenpwned.com).
  3. [b]Company or Domain Reconnaissance[/b]
    • Scraping corporate blogs, official press releases, subdomains, or online assets.
  4. [b]Geolocation & Map Intelligence[/b]
    • Integrating geospatial data from sources like OpenStreetMap.
  5. [b]Metadata Extraction[/b]
    • Analyzing file or image metadata (EXIF) to uncover hidden info such as GPS coordinates, device details, or signs of timestamp manipulation.
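
To illustrate the metadata point above: EXIF stores GPS positions as degrees, minutes, and seconds plus a hemisphere letter (the GPSLatitudeRef / GPSLongitudeRef tags), so they usually need converting before they can go on a map. A minimal sketch of that conversion, assuming the three numbers have already been read from the image with an EXIF library of your choice. The function name dms_to_decimal is made up for this example.

```python
def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert EXIF-style degrees/minutes/seconds to signed decimal degrees.

    `ref` is the hemisphere letter stored next to the value
    (GPSLatitudeRef / GPSLongitudeRef): "N", "S", "E" or "W".
    """
    ref = ref.strip().upper()
    if ref not in {"N", "S", "E", "W"}:
        raise ValueError(f"unexpected hemisphere reference: {ref!r}")
    value = float(degrees) + float(minutes) / 60 + float(seconds) / 3600
    # South and West are negative in decimal notation.
    return -value if ref in {"S", "W"} else value


if __name__ == "__main__":
    lat = dms_to_decimal(40, 26, 46, "N")
    lon = dms_to_decimal(79, 58, 56, "W")
    print(f"{lat:.5f}, {lon:.5f}")
```

Some EXIF readers return each component as a rational (numerator/denominator) object; float() handles those in most libraries, but check the type your reader returns.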

[b]2. Step-by-Step OSINT Python Script[/b]
Below is a [b]complete Python script[/b] that demonstrates a few basic OSINT tasks:
  • Gathering social media data (with publicly accessible endpoints or official APIs).
  • Checking whether an email or username appears in a known data breach (Have I Been Pwned API; account searches require an API key).
  • Doing a basic domain/subdomain lookup.
Note: Replace API keys or tokens with real credentials where needed. This sample uses free or demo endpoints where available.
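
If you would rather not paste real credentials into the script file, one option is to read them from environment variables. The sketch below is only a suggestion and not part of the script that follows: the helper name load_api_key is hypothetical, and it assumes you export variables with the same names as the script's configuration constants.

```python
import os


def load_api_key(name, environ=None):
    """Return the API key stored in environment variable `name`, or None.

    Blank values and unreplaced "YOUR_..." placeholders count as missing,
    so the caller can skip that step instead of sending a bogus key.
    """
    environ = os.environ if environ is None else environ
    value = environ.get(name, "").strip()
    if not value or value.startswith("YOUR_"):
        return None
    return value


if __name__ == "__main__":
    for var in ("TWITTER_BEARER_TOKEN", "HIBP_API_KEY", "SECURITYTRAILS_API_KEY"):
        status = "set" if load_api_key(var) else "missing"
        print(f"{var}: {status}")
```

This keeps keys out of anything you paste or commit, and a None return gives you a clean way to skip a lookup you have no key for.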


==================================================================
==================================================================
==================================================================
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Automated OSINT Example Script
Author: YourName
Date: 2025-01-24
Description: Demonstrates how to automate a simple OSINT process:
1. Gather social media mentions for a keyword (Twitter API v2 example).
2. Check if an email/username has appeared in known breaches (HIBP or similar).
3. Perform a domain/subdomain reconnaissance with a public API.

Disclaimer: This script is for EDUCATIONAL and LEGAL OSINT usage ONLY.
"""

import sys

import requests

# -------------
# Configuration
# -------------
TWITTER_BEARER_TOKEN = "YOUR_TWITTER_BEARER_TOKEN"  # https://developer.twitter.com/en/docs/au.../oauth-2-0
HIBP_API_KEY = "YOUR_HIBP_API_KEY"                  # https://haveibeenpwned.com/API/Key
SECURITYTRAILS_API_KEY = "YOUR_SECURITYTRAILS_API_KEY"  # https://securitytrails.com/corp/apidocs
# Check each site for its current free-tier, trial, or paid key options.
# If you do not have a key, you can skip or comment out the relevant parts of the script.

# -------------
# Helper Functions
# -------------
def search_twitter(keyword, max_results=10):
    """
    Searches recent tweets containing the given keyword.
    Returns a list of tweet objects (or an empty list if no results or on error).
    Note: the recent search endpoint accepts max_results between 10 and 100.
    """
    url = "https://api.twitter.com/2/tweets/search/recent"
    headers = {
        "Authorization": f"Bearer {TWITTER_BEARER_TOKEN}"
    }
    params = {
        "query": keyword,
        "max_results": max_results
    }
   
    try:
        resp = requests.get(url, headers=headers, params=params, timeout=10)
        resp.raise_for_status()
        data = resp.json()
        return data.get("data", [])
    except Exception as e:
        print(f"[!] Error during Twitter search: {e}")
        return []

def check_breach(email):
    """
    Checks if an email or username has been found in known data breaches
    using the Have I Been Pwned (HIBP) API. Returns a list of breaches.
    """
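    # HIBP expects the account to be URL-encoded (e.g. "+" or "/" in the address).
    account = requests.utils.quote(email.strip(), safe="")
    url = f"https://haveibeenpwned.com/api/v3/breachedaccount/{account}"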
    headers = {
        "hibp-api-key": HIBP_API_KEY,
        "User-Agent": "OSINT-Script"
    }
   
    try:
        resp = requests.get(url, headers=headers, timeout=10)
        if resp.status_code == 200:
            return resp.json()  # List of breach objects
        elif resp.status_code == 404:
            # 404 means no breach found for that email
            return []
        else:
            print(f"[!] Unexpected status code from HIBP: {resp.status_code}")
            return []
    except Exception as e:
        print(f"[!] Error during breach check: {e}")
        return []

def get_subdomains(domain):
    """
    Retrieves subdomains for a given domain using the SecurityTrails API.
    """
    url = f"https://api.securitytrails.com/v1/domain/{domain}/subdomains"
    headers = {
        "APIKEY": SECURITYTRAILS_API_KEY
    }
    try:
        resp = requests.get(url, headers=headers, timeout=10)
        resp.raise_for_status()
        data = resp.json()
        return data.get("subdomains", [])
    except Exception as e:
        print(f"[!] Error retrieving subdomains: {e}")
        return []

# -------------
# Main OSINT Workflow
# -------------
def main():
    # 1. Gather Twitter data
    keyword = input("Enter a keyword or hashtag to search on Twitter: ")
    tweets = search_twitter(keyword)
    print(f"\n[+] Found {len(tweets)} tweets related to '{keyword}':")
    for idx, tweet in enumerate(tweets, start=1):
        print(f"{idx}. {tweet.get('text', 'No text found')}")

    # 2. Check for data breaches
    target_email = input("\nEnter an email/username to check for known breaches: ")
    breaches = check_breach(target_email)
    if breaches:
        print(f"[!] {target_email} found in these breaches:")
        for b in breaches:
            print(f"    - {b.get('Name', 'Unknown breach')}")
    else:
        print(f"[+] No breaches found for {target_email} (or not listed).")

    # 3. Domain subdomain lookup
    domain_name = input("\nEnter a domain name to find subdomains (e.g., example.com): ")
    subdomains = get_subdomains(domain_name)
    if subdomains:
        print(f"[+] Subdomains for {domain_name}:")
        for s in subdomains:
            print(f"    - {s}.{domain_name}")
    else:
        print("[+] No subdomains found or unable to retrieve.")

if __name__ == "__main__":
    try:
        main()
    except KeyboardInterrupt:
        print("\n[!] Script interrupted by user.")
        sys.exit(1)
    except Exception as e:
        print(f"[!] An error occurred: {e}")
        sys.exit(1)

==================================================================
==================================================================
==================================================================
How the Script Works
  1. search_twitter(keyword):
    • Uses Twitter’s recent search endpoint (API v2).
    • Requires a Bearer Token (Twitter Developer account).
  2. check_breach(email):
    • Checks if the given email/username appears in a known data breach.
    • Uses the Have I Been Pwned API v3, which requires an API key for account searches (see the HIBP site for current plans).
  3. get_subdomains(domain):
    • Retrieves subdomains from the SecurityTrails API.
    • Uses a free or trial API key.
  4. main():
    • Collects user input for a Twitter keyword, an email for breach checks, and a domain for subdomain lookups.
    • Displays all fetched info in a straightforward format.
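
One thing the walkthrough above glosses over: search_twitter passes max_results straight through, and the recent search endpoint rejects values outside 10 to 100, as well as an empty query. A small sketch of a parameter builder that guards against both, assuming that documented range still applies. The name build_search_params is hypothetical and does not appear in the script.

```python
def build_search_params(keyword, max_results=10):
    """Build query parameters for a recent-search request.

    Rejects an empty query and clamps max_results into the 10-100 range
    the endpoint accepts per request.
    """
    query = keyword.strip()
    if not query:
        raise ValueError("search keyword must not be empty")
    clamped = max(10, min(100, int(max_results)))
    return {"query": query, "max_results": clamped}


if __name__ == "__main__":
    print(build_search_params("#osint", max_results=250))
```

Inside search_twitter you could then pass params=build_search_params(keyword, max_results) instead of building the dict inline.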

[b]3. Tips for Legitimate & Ethical Usage[/b]
  1. Respect Privacy & Terms of Service
    • Always ensure you have permission before you gather or scrape data.
    • Comply with each data source’s Terms of Service (ToS).
  2. Avoid Storing Sensitive Data Unnecessarily
    • If you log or store results, secure them properly and respect the data subject’s privacy.
  3. Automate Wisely
    • Large-scale or repeated requests can cause rate-limit bans or attract unwanted attention.
    • Use robust error handling and respect server load (throttling or sleep intervals).
  4. Stay Updated
    • APIs change frequently. Check official docs for updated endpoints, required parameters, and authentication methods.
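
For the throttling advice in point 3, here is a rough sketch of retrying on HTTP 429 (Too Many Requests). It honours a numeric Retry-After header when the server sends one and otherwise falls back to exponential backoff with jitter. Both function names are hypothetical, and `send` stands for any zero-argument callable that returns a response with .status_code and .headers, for example lambda: requests.get(url, headers=headers, timeout=10).

```python
import random
import time


def retry_delay(attempt, retry_after=None, base=1.0, cap=60.0):
    """Seconds to wait before retrying after failed attempt `attempt` (0-based)."""
    if retry_after is not None:
        try:
            # The server told us how long to wait; do not undercut it.
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # Retry-After can also be an HTTP date; fall back to backoff.
    ceiling = min(cap, base * (2 ** attempt))
    # Full jitter: a random delay up to the exponential ceiling.
    return random.uniform(0, ceiling)


def get_with_retries(send, max_attempts=4, sleep=time.sleep):
    """Call `send()` until it returns a non-429 response or attempts run out."""
    resp = None
    for attempt in range(max_attempts):
        resp = send()
        if resp.status_code != 429:
            return resp
        if attempt < max_attempts - 1:
            sleep(retry_delay(attempt, resp.headers.get("Retry-After")))
    return resp  # Still rate-limited; the caller decides what to do next.
```

The `sleep` parameter is there only so the wait can be swapped out in tests. In normal use you leave it at the default.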

Conclusion
This thread provides a solid foundation for building automated OSINT workflows using Python. You can tailor the script to your specific requirements, integrate more APIs (social media, domain intelligence, geolocation, etc.), and add advanced features like:
  • Natural Language Processing (NLP) for content analysis.
  • Automatic report generation in PDF or HTML.
  • Integration with databases (e.g., SQLite, PostgreSQL) for storing OSINT findings.
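
As a starting point for the database idea, here is a minimal sketch using the sqlite3 module from the standard library. The table layout, column names, and function names are assumptions made for this example, so adjust them to whatever you actually collect. Duplicate findings are skipped so repeated runs do not bloat the table.

```python
import sqlite3
from datetime import datetime, timezone


def open_store(path=":memory:"):
    """Open (or create) a findings database; defaults to an in-memory DB."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS findings (
               id INTEGER PRIMARY KEY,
               source TEXT NOT NULL,
               target TEXT NOT NULL,
               value TEXT NOT NULL,
               collected_at TEXT NOT NULL,
               UNIQUE (source, target, value)
           )"""
    )
    return conn


def save_findings(conn, source, target, values):
    """Insert findings, skipping duplicates. Returns the number of new rows."""
    now = datetime.now(timezone.utc).isoformat()
    before = conn.total_changes
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR IGNORE INTO findings (source, target, value, collected_at) "
            "VALUES (?, ?, ?, ?)",
            [(source, target, v, now) for v in values],
        )
    return conn.total_changes - before


if __name__ == "__main__":
    db = open_store()
    added = save_findings(db, "securitytrails", "example.com", ["www", "mail"])
    print(f"stored {added} new findings")
```

Pass a file path to open_store to keep results between runs, and remember the earlier point about securing anything you store.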
Feel free to share your modifications, additional OSINT resources, or new scripts. Together, we can maintain a curated collection of practical, legally compliant OSINT techniques!
Happy Hunting!