You’ll need the following before you can complete the steps on this page.
Permissions — Your administrator must have granted your AWS account access to the AmazonMWAAFullConsoleAccess access control policy for your environment. In addition, your execution role must permit your Amazon MWAA environment to access the AWS resources used by your environment.
Access — If you require access to public repositories to install dependencies directly on the web server, your environment must be configured with public network web server access. For more information, see Apache Airflow access modes.
Amazon S3 configuration — The Amazon S3 bucket used to store your DAGs, custom plugins in plugins.zip, and Python dependencies in requirements.txt must be configured with Public Access Blocked and Versioning Enabled.
On Amazon MWAA, you install all Python dependencies by uploading a requirements.txt file to your Amazon S3 bucket, then
specifying the version of the file on the Amazon MWAA console each time you update the file. Amazon MWAA runs pip3 install -r requirements.txt
to install the Python dependencies on the Apache Airflow scheduler and each of the workers.
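For example, a minimal requirements.txt pins each package to a specific version. The packages below match the sample log output shown later on this page; the exact packages and versions you need will differ:

```
appdirs==1.4.4
astroid==2.4.2
```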
To use Python dependencies on your environment, you must do three things:

1. Create a requirements.txt file locally.
2. Upload the local requirements.txt to your Amazon S3 bucket.
3. Specify the version of this file on the Amazon MWAA console.

You can install Apache Airflow extras and other Python dependencies from the Python Package Index (PyPI.org), Python wheels (.whl), or Python dependencies hosted on a private PyPI/PEP-503 compliant repository on your environment.
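Before uploading, it can help to sanity-check the file locally so that every dependency is pinned to an exact version. The sketch below is illustrative; the helper name and the strict `==` pinning policy are assumptions, not an MWAA requirement:

```python
import re

def check_requirements(text: str) -> list:
    """Return the lines that do not pin an exact version with '=='."""
    unpinned = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Accept "name==version" (extras in brackets are allowed in the name).
        if not re.match(r"^[A-Za-z0-9._\[\],-]+==[A-Za-z0-9.*+!-]+$", line):
            unpinned.append(line)
    return unpinned

sample = """\
# core pins
appdirs==1.4.4
astroid==2.4.2
requests
"""
print(check_requirements(sample))  # lines missing an exact pin
```

Running this on the sample flags `requests` because it carries no version pin.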
The Apache Airflow scheduler and workers look for the packages in the requirements.txt file, and the packages are installed on the environment at /usr/local/airflow/.local/bin.
This section describes how to install the dependencies you uploaded to your Amazon S3 bucket by specifying the path to the requirements.txt file, and specifying the version of the requirements.txt file each time it’s updated.
If this is the first time you’re creating and uploading a requirements.txt to your Amazon S3 bucket, you also need to specify the path to the file on the Amazon MWAA console. You only need to complete this step once.
You can begin using the new packages immediately after your environment finishes updating.
You need to specify the version of your requirements.txt file on the Amazon MWAA console each time you upload a new version of your requirements.txt in your Amazon S3 bucket.
You can begin using the new packages immediately after your environment finishes updating.
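The console step can also be scripted with the AWS CLI. The sketch below only assembles the equivalent `aws mwaa update-environment` call rather than running it; the `--requirements-s3-path` and `--requirements-s3-object-version` flags correspond to the UpdateEnvironment API parameters, and the environment name and version ID values are placeholders:

```python
def build_update_command(env_name: str, s3_path: str, version_id: str) -> str:
    """Assemble an 'aws mwaa update-environment' call that points the
    environment at a specific version of requirements.txt."""
    return (
        "aws mwaa update-environment "
        f"--name {env_name} "
        f"--requirements-s3-path {s3_path} "
        f"--requirements-s3-object-version {version_id}"
    )

# Placeholder values for illustration only.
print(build_update_command("MyAirflowEnvironment", "requirements.txt",
                           "Ab1Cd2Ef3ExampleVersionId"))
```

The version ID is the Amazon S3 object version assigned to requirements.txt when you uploaded it to your versioned bucket.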
You can view Apache Airflow logs for the Scheduler scheduling your workflows and parsing your dags folder. The following steps describe how to open the log group for the Scheduler on the Amazon MWAA console, and view Apache Airflow logs on the CloudWatch Logs console.
To view logs for a requirements.txt

1. Open the Environments page on the Amazon MWAA console.
2. Choose an environment.
3. Choose the Airflow scheduler log group on the Monitoring pane.
4. Choose the requirements_install_ip log in Log streams.

You should see the list of packages that were installed on the environment at /usr/local/airflow/.local/bin. For example:
Collecting appdirs==1.4.4 (from -r /usr/local/airflow/.local/bin (line 1))
Downloading https://files.pythonhosted.org/packages/3b/00/2344469e2084fb28kjdsfiuyweb47389789vxbmnbjhsdgf5463acd6cf5e3db69324/appdirs-1.4.4-py2.py3-none-any.whl
Collecting astroid==2.4.2 (from -r /usr/local/airflow/.local/bin (line 2))
Review the list of packages and check whether any of them encountered an error during installation. If something went wrong, you may see an error similar to the following:
2021-03-05T14:34:42.731-07:00
No matching distribution found for LibraryName==1.0.0 (from -r /usr/local/airflow/.local/bin (line 4))
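If the log stream is long, you can also scan a downloaded copy of it programmatically instead of reading line by line. A minimal sketch, assuming you have saved the requirements_install_ip log text locally (the error markers and the abridged sample lines are taken from the examples above):

```python
# Substrings that indicate a failed package install in the pip output.
ERROR_MARKERS = ("No matching distribution found", "ERROR:")

def find_install_errors(log_text: str) -> list:
    """Return the log lines that indicate a failed package install."""
    return [line for line in log_text.splitlines()
            if any(marker in line for marker in ERROR_MARKERS)]

sample_log = """\
Collecting appdirs==1.4.4 (from -r /usr/local/airflow/.local/bin (line 1))
Collecting astroid==2.4.2 (from -r /usr/local/airflow/.local/bin (line 2))
No matching distribution found for LibraryName==1.0.0 (from -r /usr/local/airflow/.local/bin (line 4))
"""
print(find_install_errors(sample_log))  # only the failing line
```

A "No matching distribution found" error usually means the package name or version in requirements.txt does not exist on the repository your environment can reach.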