An Apache Spark project analyzing NYC Yellow Taxi trip data to uncover trends, forecast fares, and visualize geospatial insights.

Milanka00/NYC-Taxi-Data-Analysis

nyc-taxi-spark-analysis

An Apache Spark project analyzing NYC Yellow Taxi trip data to uncover trends, forecast fares, and visualize geospatial insights.

Dataset: https://www.nyc.gov/site/tlc/about/tlc-trip-record-data.page

Data Dictionary

| Field Name | Description |
| --- | --- |
| VendorID | A code indicating the TPEP provider that provided the record: 1 = Creative Mobile Technologies, LLC; 2 = VeriFone Inc. |
| tpep_pickup_datetime | The date and time when the meter was engaged. |
| tpep_dropoff_datetime | The date and time when the meter was disengaged. |
| Passenger_count | The number of passengers in the vehicle. This is a driver-entered value. |
| Trip_distance | The elapsed trip distance in miles reported by the taximeter. |
| Pickup_longitude | Longitude where the meter was engaged. |
| Pickup_latitude | Latitude where the meter was engaged. |
| RateCodeID | The final rate code in effect at the end of the trip: 1 = Standard rate; 2 = JFK; 3 = Newark; 4 = Nassau or Westchester; 5 = Negotiated fare; 6 = Group ride. |
| Store_and_fwd_flag | Indicates whether the trip record was held in vehicle memory before being sent to the vendor (“store and forward”): Y = store and forward trip; N = not a store and forward trip. |
| Dropoff_longitude | Longitude where the meter was disengaged. |
| Dropoff_latitude | Latitude where the meter was disengaged. |
| Payment_type | A numeric code signifying how the passenger paid: 1 = Credit card; 2 = Cash; 3 = No charge; 4 = Dispute; 5 = Unknown; 6 = Voided trip. |
| Fare_amount | The time-and-distance fare calculated by the meter. |
| Extra | Miscellaneous extras and surcharges, e.g., the $0.50 and $1 rush hour and overnight charges. |
| MTA_tax | $0.50 MTA tax, automatically triggered based on the metered rate in use. |
| Improvement_surcharge | $0.30 surcharge assessed at the flag drop, introduced in 2015. |
| Tip_amount | Tip amount. Automatically recorded for credit card tips; cash tips are not included. |
| Tolls_amount | Total amount of all tolls paid during the trip. |
| Total_amount | The total amount charged to passengers, excluding cash tips. |
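
Several fields in the dictionary above are stored as codes rather than labels. As a minimal sketch (not taken from this repository's code), the lookup tables below copy those code values into plain Python dictionaries and use them to decode a single trip record. The names `CODE_TABLES` and `decode_trip` are hypothetical and chosen only for illustration; the field names follow the spelling used in the dictionary, which may differ in case from the column names in the downloaded files.

```python
from typing import Any, Dict

# Code tables copied from the data dictionary above.
VENDOR = {
    1: "Creative Mobile Technologies, LLC",
    2: "VeriFone Inc.",
}
RATE_CODE = {
    1: "Standard rate",
    2: "JFK",
    3: "Newark",
    4: "Nassau or Westchester",
    5: "Negotiated fare",
    6: "Group ride",
}
PAYMENT_TYPE = {
    1: "Credit card",
    2: "Cash",
    3: "No charge",
    4: "Dispute",
    5: "Unknown",
    6: "Voided trip",
}
STORE_AND_FWD = {
    "Y": "store and forward trip",
    "N": "not a store and forward trip",
}

# Hypothetical helper structure: maps each coded field to its lookup table.
CODE_TABLES: Dict[str, Dict[Any, str]] = {
    "VendorID": VENDOR,
    "RateCodeID": RATE_CODE,
    "Payment_type": PAYMENT_TYPE,
    "Store_and_fwd_flag": STORE_AND_FWD,
}


def decode_trip(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `record` with coded fields replaced by their labels.

    Codes that do not appear in the data dictionary are kept visible as
    "Unrecognized (<value>)" instead of being dropped silently.
    """
    decoded = dict(record)
    for field, table in CODE_TABLES.items():
        if field in decoded:
            value = decoded[field]
            decoded[field] = table.get(value, f"Unrecognized ({value!r})")
    return decoded


if __name__ == "__main__":
    # Made-up record, for illustration only.
    sample = {"VendorID": 2, "RateCodeID": 2, "Payment_type": 1, "Fare_amount": 52.0}
    print(decode_trip(sample))
```

Keeping unrecognized codes visible is a deliberate choice in this sketch: real trip files may contain values outside the documented ranges, and labelling them makes such rows easy to count before deciding whether to filter them. In a Spark job, the same tables could be applied as small lookup DataFrames joined on the code column.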
