Explored Python for cybersecurity and data analysis through a portfolio project focused on downloading, cleaning, and visualizing real-world datasets. Demonstrated ability to parse CSV/JSON/GeoJSON data and generate both static and interactive visualizations using Matplotlib and Plotly. The project also applies refactoring, error handling, and reproducibility practices (virtual environments, requirements file, structured repo).

Cybersecurity-transferable skills:
- Parsing and cleaning log files, vulnerability scan results, and breach datasets.
- Accessing and processing security APIs (e.g., VirusTotal, HaveIBeenPwned, Shodan).
- Building visualizations of attack trends to communicate risks to stakeholders.
- Applying refactoring and error handling that mirror practices in secure coding.
👉 Recruiters and hiring managers: This project demonstrates my ability to work with real-world data and lays the groundwork for automation and analysis in cybersecurity contexts.
This project demonstrates how to ingest, process, and visualize structured data (CSV, JSON, and GeoJSON) using Python. It includes:
- Parsing and plotting weather data from CSV files.
- Downloading GeoJSON and CSV datasets (earthquakes, wildfires).
- Applying data cleaning, error checking, and refactoring.
- Generating visualizations with Matplotlib and Plotly, including static and interactive charts.
- Data parsing from CSV, JSON, and GeoJSON
- Dataset exploration
- Time-series visualization with datetime
- Plot customization (colors, scales, shading)
- Global data mapping (earthquakes, fires)
- Refactoring with list comprehensions and automated headers
- Professional Git/GitHub workflow (branching, version control, .gitignore)
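
As a minimal sketch of the CSV parsing, automated header lookup, and datetime handling listed above, the snippet below reads daily highs and lows from a small inline sample. The sample rows, the station ID, and the function name `read_highs_lows` are hypothetical, and the column names `DATE`, `TMAX`, and `TMIN` are an assumption about the layout of the files in `data/weather_data/`; the scripts in `src/` may be organized differently.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample rows; the station ID and readings are made up.
SAMPLE_CSV = """STATION,NAME,DATE,TMAX,TMIN
STATION_ID,"SITKA AIRPORT, AK US",2021-07-01,61,53
STATION_ID,"SITKA AIRPORT, AK US",2021-07-02,,52
STATION_ID,"SITKA AIRPORT, AK US",2021-07-03,63,54
"""


def read_highs_lows(lines):
    """Return (dates, highs, lows), skipping rows with missing readings."""
    reader = csv.reader(lines)
    header = next(reader)

    # Look up column positions by name instead of hard-coding indexes.
    date_i = header.index("DATE")
    high_i = header.index("TMAX")
    low_i = header.index("TMIN")

    dates, highs, lows = [], [], []
    for row in reader:
        try:
            date = datetime.strptime(row[date_i], "%Y-%m-%d")
            high = int(row[high_i])
            low = int(row[low_i])
        except ValueError:
            # An empty or malformed field lands here; report it and move on.
            print(f"Missing data for {row[date_i]}")
            continue
        dates.append(date)
        highs.append(high)
        lows.append(low)
    return dates, highs, lows


dates, highs, lows = read_highs_lows(io.StringIO(SAMPLE_CSV))
```

The returned lists can then be handed to Matplotlib, for example `ax.plot(dates, highs)`, with `ax.fill_between(dates, highs, lows, alpha=0.1)` for the shaded band between the two series.
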
project-data-visualization/
│
├── data/ # Input CSV/JSON/GeoJSON datasets
│ ├── earthquake_data/
│ │ ├── eq_data_1_day_m1.geojson
│ │ ├── eq_data_30_day_m1.geojson
│ │ ├── eq_data_past_30_days_m4plus.geojson
│ │ ├── readable_eq_data_geojson
│ │ └── README.md
│ ├── weather_data/
│ │ ├── death_valley_2021_full.csv
│ │ ├── death_valley_2021_simple.csv
│ │ ├── greater_seattle_2024_dense.csv
│ │ ├── README.md
│ │ ├── redmond_wa_2024_simple.csv
│ │ ├── sitka_weather_07-2021_simple.csv
│ │ ├── sitka_weather_2021_full.csv
│ │ └── sitka_weather_2021_simple.csv
│ ├── wildfire_data/
│ │ ├── world_fires_1_day.csv
│ │ └── world_fires_7_day.csv
│ └── README.md
│
├── images/portfolio/ # Exported static plots
│ ├── Recent_Earthquakes.png
│ ├── Sitka_Death_Valley_Comparison.png
│ └── World_Wildfires.png
│
├── src/ # Python scripts for each exercise
│ ├── 16_1_death_valley_rainfall.py
│ ├── 16_1_sitka_rainfall.py
│ ├── 16_2_sitka_death_valley_comparison.py
│ ├── 16_4_automatic_indexes.py
│ ├── 16_6_refactoring.py
│ ├── 16_7_automated_title.py
│ ├── 16_8_recent_earthquakes.py
│ ├── 16_9_world_fires.py
│ ├── death_valley_highs_lows.py
│ ├── eq_explore_data.py
│ ├── main.py
│ ├── redmond_wa_rain_snow_december_2024.py
│ ├── sitka_highs.py
│ └── sitka_highs_lows.py
│
├── tests/
├── .gitattributes
├── .gitignore
├── LICENSE
├── README.md
└── requirements.txt
- Clone this repo and move into the folder:

      git clone git@github.com:your-username/project-data-visualization.git
      cd project-data-visualization

- (Optional) Create a virtual environment:

      python3 -m venv .venv
      source .venv/bin/activate

- Install dependencies:

      pip install -r requirements.txt

- Run any exercise script, for example:

      python src/16_8_recent_earthquakes.py

This project highlights skills in:
- Python scripting for data processing
- CSV, JSON, and GeoJSON importing and processing
- Data visualization using Matplotlib and Plotly
- Working with real-world datasets (climate, seismic, fire activity)
- Applying PEP 8, project structuring, and GitHub portfolio practices
Through this project, I strengthened both my Python skills and their application to cybersecurity:
- Data parsing & cleaning → transferable to analyzing log files, firewall exports, and SIEM data.
- Datetime handling → useful for correlating events across multiple security data sources.
- Visualization (Matplotlib & Plotly) → critical for reporting incidents and trends to executives or compliance auditors.
- Error handling & validation → aligns with defensive programming principles in security tools.
- Refactoring & automation → prepares me to build reusable scripts for common security workflows.
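
To illustrate how the parsing, datetime, and error-handling skills above might carry over to security data, here is a small sketch that groups failed logins by hour. The log format, the event names, and the function `parse_events` are hypothetical and not part of this project; the IP addresses come from the reserved documentation ranges.

```python
from collections import Counter
from datetime import datetime

# Hypothetical log lines in a made-up "timestamp action source" format.
SAMPLE_LOG = """\
2025-01-01T12:00:03 FAILED_LOGIN 203.0.113.5
2025-01-01T12:00:09 FAILED_LOGIN 203.0.113.5
not a valid line
2025-01-01T12:01:30 LOGIN_OK 198.51.100.7
2025-01-01T13:15:00 FAILED_LOGIN 198.51.100.7
"""


def parse_events(text):
    """Return a list of (datetime, action, source) tuples, skipping bad lines."""
    events = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        try:
            # Unpacking raises ValueError if the line does not have 3 fields.
            stamp, action, source = line.split()
            when = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S")
        except ValueError:
            print(f"Skipping malformed line {line_no}: {line!r}")
            continue
        events.append((when, action, source))
    return events


events = parse_events(SAMPLE_LOG)

# Bucket failed logins by hour so spikes are easy to spot or plot.
failures_per_hour = Counter(
    when.strftime("%Y-%m-%d %H:00")
    for when, action, _source in events
    if action == "FAILED_LOGIN"
)
```

The same try/except pattern used for missing weather readings is what keeps one malformed line from stopping the whole run.
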
Problems, errors, and issues overcome during this project:
- Gained hands-on experience resolving Git merge conflicts and managing divergent branches.
- Learned how to configure and update branch protection rules to balance collaboration with control.
- Improved workflow efficiency by practicing PyCharm Git integration and falling back to the CLI when needed.
- Built resilience by troubleshooting environment and formatting issues in VS Code and PyCharm.
- Reinforced the importance of clear data directory structures for consistent project organization.
- Version Control: Git branching, conflict resolution, and pull request management.
- Tool Proficiency: PyCharm, VS Code, GitHub Desktop, and CLI.
- Problem-Solving: Overcame environment friction and repo cleanup challenges.
- Cybersecurity Relevance: Applied structured, methodical approaches to error handling and configuration—transferable to securing and maintaining resilient systems.
This project helped me bridge Python fundamentals with practical cybersecurity applications, positioning me to develop tools that improve detection, response, and reporting.
- 16-1: Sitka Rainfall → Visualized daily rainfall (Sitka & Death Valley).
- 16-2: Sitka–Death Valley Comparison → Standardized y-axes for temperature comparisons.
- 16-4: Automatic Indexes → Automated detection of CSV header indexes and titles.
- 16-6: Refactoring → Simplified earthquake data parsing with list comprehensions.
- 16-7: Automated Title → Dynamically pulled dataset titles from GeoJSON metadata.
- 16-8: Recent Earthquakes → Visualized past 30 days of earthquake data (incl. 2025 Kamchatka quake).
- 16-9: World Fires → Plotted NASA FIRMS fire data with intensity-based opacity and color scaling.
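
The refactoring and automated-title exercises (16-6 and 16-7) can be sketched roughly as follows. The inline sample mimics the general shape of a USGS GeoJSON feed (`metadata`, `features`, `properties`, `geometry.coordinates`), but its values and titles are made up, and the actual scripts in `src/` may differ in detail.

```python
import json

# Hypothetical sample shaped like a USGS GeoJSON feed; all values are made up.
SAMPLE_GEOJSON = """
{
  "type": "FeatureCollection",
  "metadata": {"title": "Sample Earthquake Feed, Past Day"},
  "features": [
    {
      "type": "Feature",
      "properties": {"mag": 1.6, "title": "M 1.6 - sample event A"},
      "geometry": {"type": "Point", "coordinates": [-150.0, 61.5, 10.0]}
    },
    {
      "type": "Feature",
      "properties": {"mag": 4.2, "title": "M 4.2 - sample event B"},
      "geometry": {"type": "Point", "coordinates": [160.0, 52.5, 35.0]}
    }
  ]
}
"""

data = json.loads(SAMPLE_GEOJSON)

# Pull the chart title from the feed metadata instead of hard-coding it.
title = data["metadata"]["title"]
features = data["features"]

# List comprehensions replace the loop-and-append version of the parser.
# GeoJSON coordinates are ordered [longitude, latitude, depth].
mags = [f["properties"]["mag"] for f in features]
lons = [f["geometry"]["coordinates"][0] for f in features]
lats = [f["geometry"]["coordinates"][1] for f in features]
hover_texts = [f["properties"]["title"] for f in features]
```

With Plotly installed, these lists can be passed to Plotly Express, for example `px.scatter_geo(lat=lats, lon=lons, size=mags, title=title)`, to produce an interactive world map.
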
- NOAA Climate Data — temperature and rainfall data for Sitka, Alaska, and Death Valley, California. ncdc.noaa.gov
- USGS Earthquake Hazards Program — real-time and historical earthquake data in GeoJSON format: earthquake.usgs.gov
- NASA Earthdata (FIRMS) — global active fire data: earthdata.nasa.gov/firms
- Datasets and exercises adapted from Python Crash Course, 3rd Edition by Eric Matthes (No Starch Press).
Based on exercises from:
Matthes, E. (2023). Python Crash Course (3rd ed.). No Starch Press.
Distributed under the MIT License.
See LICENSE for details.