This is self-documentation of learning distributed data storage, parallel processing, and the Linux OS using Apache Hadoop, Apache Spark, and Raspbian OS. In this project, a 3-node cluster is set up on Raspberry Pi 4 boards, HDFS is installed, and Spark processing jobs are run via YARN, as in the sketch below.
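As a rough illustration of that last step, a Spark job targeting the YARN resource manager might look like the minimal Scala word count below. The HDFS host name `pi-master`, the port 9000, and the file paths are assumptions for the sketch, not values taken from this project.

```scala
// A minimal sketch, assuming Spark and Hadoop are already installed on the
// three Raspberry Pi nodes and the HDFS NameNode runs at the hypothetical
// address hdfs://pi-master:9000 (host name and port are illustrative).
import org.apache.spark.sql.SparkSession

object WordCount {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("PiClusterWordCount")
      // "yarn" tells Spark to request its executors from the YARN
      // ResourceManager instead of running locally.
      .master("yarn")
      .getOrCreate()

    // Read a text file from HDFS, count word occurrences, write results back.
    val counts = spark.sparkContext
      .textFile("hdfs://pi-master:9000/data/input.txt")
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1))
      .reduceByKey(_ + _)

    counts.saveAsTextFile("hdfs://pi-master:9000/data/output")
    spark.stop()
  }
}
```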
This Big Data project collects data on vehicle theft in the city of São Paulo and consolidates it into counts and a heat map, highlighting the areas with the highest incidence of this type of crime. Everything runs on AWS resources.
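The per-area counting step could be sketched in Spark as follows. The S3 bucket, file name, and the `district` column are illustrative assumptions; the real dataset's schema and location may differ.

```scala
// A minimal sketch of the counting step, assuming the theft records are a
// CSV file with a header row and a "district" column (column name and S3
// path are hypothetical, standing in for the project's actual AWS storage).
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, count}

object TheftCounts {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("SaoPauloVehicleTheft")
      .getOrCreate()

    val thefts = spark.read
      .option("header", "true")
      .csv("s3a://example-bucket/sao-paulo-vehicle-theft.csv")

    // Count incidents per district; these totals drive the heat map coloring.
    val perDistrict = thefts
      .groupBy("district")
      .agg(count("*").as("theft_count"))
      .orderBy(col("theft_count").desc)

    perDistrict.show(20)
    spark.stop()
  }
}
```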
A library providing a remote JVM ExecutorService whose only dependency is password-less SSH -- run clustered Hadoop/Spark jobs from your IDE -- an IDE-pimped Spark shell with full auto-completion!
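The description above does not show the library's actual API, so the sketch below only illustrates the general idea of an ExecutorService whose tasks run in another JVM. The `overSsh` factory is a hypothetical stand-in, stubbed here with a local thread pool so the snippet compiles and runs.

```scala
// A hedged sketch of the concept only: overSsh is NOT the library's real
// entry point. In the real library it would presumably start a JVM over
// password-less SSH and route tasks to it; here it is stubbed locally.
import java.util.concurrent.{Callable, ExecutorService, Executors}

object RemoteExecutorSketch {
  // Hypothetical factory: returns an ExecutorService for the given host.
  // Stubbed with a single local worker thread for this illustration.
  def overSsh(host: String): ExecutorService =
    Executors.newSingleThreadExecutor()

  def main(args: Array[String]): Unit = {
    val pool = overSsh("spark-master.example.com") // host name is illustrative

    // Tasks are plain Callables; a remote variant would serialize the task,
    // execute it on the remote JVM, and ship the result back.
    val future = pool.submit(new Callable[Int] {
      override def call(): Int = 21 * 2
    })

    println(s"result = ${future.get()}") // prints: result = 42
    pool.shutdown()
  }
}
```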