It’s good practice to plan the steps for transforming your data before you build the job. Based on the information captured during the data exploration stage, we can come up with the following transformation steps:
Go to the AWS Glue console.
In the left navigation panel, click ETL jobs.
On the AWS Glue Studio page, click Visual ETL.

Adding Taxi Zone Lookup data from Amazon S3
s3://{RAW_BUCKET}/nyc-taxi/taxi_zone_lookup/
Modify data types

Save transformed data to Amazon S3
s3://{Standardize_BUCKET}/taxi_zone_lookup/
Set job detail
Specify IAM role

Run job
Click Run.

Check output
Go to the standardize bucket in the S3 console.
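
For readers who prefer a script view, the visual job above (S3 source, type change, S3 target) can be sketched roughly as a Glue PySpark script. This is a minimal sketch, not the script Glue Studio generates: the bucket names, the column names and target types, the CSV input format, and the Parquet output format are all assumptions, so adjust them to what you saw during data exploration.

```python
# Sketch of the taxi_zone_lookup job as a Glue PySpark script.
# Bucket names, column names, types, and file formats are assumptions.

RAW_BUCKET = "my-raw-bucket"                  # hypothetical stand-in for {RAW_BUCKET}
STANDARDIZE_BUCKET = "my-standardize-bucket"  # hypothetical stand-in for {Standardize_BUCKET}

SOURCE_PATH = f"s3://{RAW_BUCKET}/nyc-taxi/taxi_zone_lookup/"
TARGET_PATH = f"s3://{STANDARDIZE_BUCKET}/taxi_zone_lookup/"

# (source column, source type, target column, target type): the tuple
# layout that ApplyMapping expects. The column names here are assumed.
MAPPINGS = [
    ("LocationID", "string", "locationid", "int"),
    ("Borough", "string", "borough", "string"),
    ("Zone", "string", "zone", "string"),
    ("service_zone", "string", "service_zone", "string"),
]


def run_job():
    # Imported lazily: these modules exist only inside the Glue runtime.
    from awsglue.context import GlueContext
    from awsglue.transforms import ApplyMapping
    from pyspark.context import SparkContext

    glue_context = GlueContext(SparkContext.getOrCreate())

    # Source node: read the raw files (CSV with a header row is assumed).
    source = glue_context.create_dynamic_frame.from_options(
        connection_type="s3",
        connection_options={"paths": [SOURCE_PATH]},
        format="csv",
        format_options={"withHeader": True},
    )

    # Transform node: rename and cast columns to their target types.
    typed = ApplyMapping.apply(frame=source, mappings=MAPPINGS)

    # Target node: write to the standardize bucket (Parquet is assumed).
    glue_context.write_dynamic_frame.from_options(
        frame=typed,
        connection_type="s3",
        connection_options={"path": TARGET_PATH},
        format="parquet",
    )
```

Calling run_job() only works inside a Glue job, where the awsglue modules are available. The same shape should apply to the yellow_tripdata job with its own paths and mappings.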

Adding Yellow Trips data from Amazon S3
s3://{RAW_BUCKET}/nyc-taxi/yellow_tripdata/
Modify data types

Save transformed data to Amazon S3
s3://{Standardize_BUCKET}/yellow_tripdata/
Set job detail
Specify IAM role

Run job
Click Run.

Check output
Go to the standardize bucket in the S3 console.
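
The Run job and Check output steps can also be done from a script. The sketch below assumes boto3 clients are passed in by the caller; the job name and bucket name in the usage comment are hypothetical placeholders, not names from this workshop.

```python
import time

# Glue job run states after which the run will not change any further.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT", "ERROR", "EXPIRED"}


def run_job_and_wait(glue, job_name, poll_seconds=30):
    """Start a Glue job run and poll until it reaches a terminal state."""
    run_id = glue.start_job_run(JobName=job_name)["JobRunId"]
    while True:
        run = glue.get_job_run(JobName=job_name, RunId=run_id)["JobRun"]
        if run["JobRunState"] in TERMINAL_STATES:
            return run["JobRunState"]
        time.sleep(poll_seconds)


def list_output_keys(s3, bucket, prefix):
    """Return object keys under a prefix (first page only, up to 1,000 keys)."""
    response = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
    return [obj["Key"] for obj in response.get("Contents", [])]


# Usage, assuming boto3 is installed and AWS credentials are configured
# (the job and bucket names are hypothetical):
#   import boto3
#   state = run_job_and_wait(boto3.client("glue"), "yellow-tripdata-job")
#   keys = list_output_keys(boto3.client("s3"), "my-standardize-bucket", "yellow_tripdata/")
```

An empty list from list_output_keys after a SUCCEEDED run usually means the target path in the job does not match the prefix being checked.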

Go to the AWS Glue Console.
In the left navigation menu, click Crawlers.
On the Crawlers page, select your crawler, and then click Run crawler.

In the left navigation menu, click Tables.

On the Tables page, click the table name to review the table metadata and schema information.
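
The crawler run and the table review above can be sketched with boto3 as well. The crawler and database names are whatever you created earlier; the functions below take them as parameters and assume a boto3 Glue client is passed in.

```python
import time


def run_crawler_and_wait(glue, crawler_name, poll_seconds=15):
    """Start a crawler and block until it returns to the READY state."""
    glue.start_crawler(Name=crawler_name)
    while glue.get_crawler(Name=crawler_name)["Crawler"]["State"] != "READY":
        time.sleep(poll_seconds)


def table_schemas(glue, database_name):
    """Map each table name to its (column name, column type) pairs.

    Reads only the first page of get_tables results, which is enough
    for a database with a handful of tables.
    """
    tables = glue.get_tables(DatabaseName=database_name)["TableList"]
    return {
        t["Name"]: [(c["Name"], c["Type"]) for c in t["StorageDescriptor"]["Columns"]]
        for t in tables
    }


# Usage, assuming boto3 is installed and AWS credentials are configured
# (the crawler and database names are hypothetical):
#   import boto3
#   glue = boto3.client("glue")
#   run_crawler_and_wait(glue, "my-crawler")
#   print(table_schemas(glue, "my_database"))
```

Comparing the returned column types with the types set in the ETL jobs is a quick way to confirm that the data type changes were written as intended.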



Amazon Athena automatically stores query results and metadata information for each query that runs in a query result location that you can specify in Amazon S3. If necessary, you can access the files in this location to work with them. You can also download query result files directly from the Athena console.
Go to the Athena console and click Get Started.

Choose Edit Settings, click Browse S3, and select your bucket as the value for the Location of query result - optional field.


From the top menu, click Editor to return to the Query editor page.
Choose database
Choose table
Choose preview table
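
The query result location and the table preview can be exercised together from a script. This is a sketch under assumptions: a boto3 Athena client is passed in, the output location is an S3 URI in a bucket you own, and the query is a plain SELECT with a LIMIT, similar to what the console's preview runs.

```python
import time


def preview_table(athena, database, table, output_location, limit=10, poll_seconds=1):
    """Run SELECT * ... LIMIT n and return the rows as lists of strings."""
    query_id = athena.start_query_execution(
        QueryString=f'SELECT * FROM "{table}" LIMIT {int(limit)}',
        QueryExecutionContext={"Database": database},
        # Athena writes the result files for this query under this S3 location.
        ResultConfiguration={"OutputLocation": output_location},
    )["QueryExecutionId"]

    # Poll until the query finishes one way or another.
    while True:
        execution = athena.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)
    if state != "SUCCEEDED":
        raise RuntimeError(f"Athena query ended in state {state}")

    rows = athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]
    # The first row holds the column headers; null cells have no VarCharValue.
    return [[cell.get("VarCharValue") for cell in row["Data"]] for row in rows]


# Usage, assuming boto3 is installed and AWS credentials are configured
# (the database, table, and bucket names are hypothetical):
#   import boto3
#   rows = preview_table(boto3.client("athena"), "my_database", "yellow_tripdata",
#                        "s3://my-athena-results-bucket/")
```

Passing the output location per query has the same effect for that query as setting the Location of query result field in the console settings.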

