To make local modules (such as utils.py) available on the workers, use the client's upload_file() method:
client.upload_file('utils.py')
However, if the scheduler activates new workers, those workers never receive the previously uploaded scripts. We must therefore either re-run the upload_file() command whenever workers are added, or simply stop scaling workers.
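A minimal local sketch of the upload_file() workflow, assuming dask.distributed is installed; the utils.py module and its double() function are placeholders created on the spot for illustration:

```python
from pathlib import Path
from dask.distributed import Client

# Create a throwaway helper module to stand in for the real utils.py
Path('utils.py').write_text("def double(x):\n    return 2 * x\n")

# In-process cluster just for the demo; real deployments point Client
# at a remote scheduler address instead
client = Client(processes=False, n_workers=1)

# Ship the module to every *currently running* worker;
# workers started later will NOT have it
client.upload_file('utils.py')

# Tasks on the workers can now import the uploaded module
result = client.submit(lambda x: __import__('utils').double(x), 21).result()
print(result)

client.close()
Path('utils.py').unlink()
```

Note that upload_file() only pushes the file to the workers that exist at call time, which is exactly why scaled-up workers are missing it.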
When using Dask's JupyterLab extension, the workers run in the same Python environment in which JupyterLab itself was started. So, to make sure the workers use the right virtual environment, launch JupyterLab from within that environment's shell:
source /Users/andreferreira/Library/Caches/pypoetry/virtualenvs/eicu-mortality-prediction-py3.7/bin/activate
jupyter lab --no-browser
When saving a DataFrame with mixed-type (object) columns to Parquet, an error like the following may be raised:
ValueError: Error converting column "col_name" to bytes using encoding UTF8. Original error: bad argument type for built-in operation
The fix is to specify the most adequate data type for each troublesome column, using the astype() method. If saving to Parquet still fails, try switching between the fastparquet and pyarrow engines, or set the object_encoding parameter to a different value.
# Set some mixed data type columns straight, by defining their most adequate representation
nursechart_df['Pain Score/Goal'] = nursechart_df['Pain Score/Goal'].astype(str)
nursechart_df['Glasgow coma score'] = nursechart_df['Glasgow coma score'].astype(str)
nursechart_df['Pain Score'] = nursechart_df['Pain Score'].astype(float)
nursechart_df['GCS Total'] = nursechart_df['GCS Total'].astype(str)
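The problem and fix can be reproduced on a small frame; the column name and values here are hypothetical stand-ins for the mixed-type nurse chart columns:

```python
import pandas as pd

# A column mixing numbers and free text gets the ambiguous 'object' dtype,
# which is what trips up the Parquet writers
df = pd.DataFrame({'Pain Score/Goal': [7, 'Unable to score', 3]})
print(df['Pain Score/Goal'].dtype)  # object, with mixed int/str values inside

# Pick one explicit representation per column before calling to_parquet():
# everything becomes a string, so the writer sees a uniform type
df['Pain Score/Goal'] = df['Pain Score/Goal'].astype(str)

# All values are now plain Python strings
print(df['Pain Score/Goal'].tolist())
```

Columns that are genuinely numeric (like 'Pain Score' above) should instead be cast with astype(float), so that numeric semantics are preserved in the Parquet file.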