/opt/ml/ contains subfolders:
input/ for configuration and data.
    config/ for hyperparameters.
        hyperparameters.json for hyperparameters.
        resourceconfig.json for instance type and count.
    data/ for training and test data.
        channels/ for multiple data channels.
code/ for scripts.
model/ for trained models.
output/ for failure captures.
    failure/ for failure logs.

Scikit-Learn Training and Serving Example
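As a rough illustration of the /opt/ml layout listed earlier, here is a minimal sketch of how a training script might read the files under input/config. The function names load_training_config and list_channels are hypothetical, and treating each subfolder of input/data as one channel is an assumption based on that list. Hyperparameter values usually arrive as strings, so the script has to cast them itself.

```python
import json
from pathlib import Path


def load_training_config(base_dir="/opt/ml"):
    """Read the JSON files under <base_dir>/input/config (hypothetical helper)."""
    config_dir = Path(base_dir) / "input" / "config"
    config = {}
    for name in ("hyperparameters", "resourceconfig"):
        path = config_dir / f"{name}.json"
        # A missing file gives an empty dict instead of raising.
        config[name] = json.loads(path.read_text()) if path.exists() else {}
    return config


def list_channels(base_dir="/opt/ml"):
    """List data channels, assuming one subfolder per channel under input/data."""
    data_dir = Path(base_dir) / "input" / "data"
    if not data_dir.is_dir():
        return []
    return sorted(p.name for p in data_dir.iterdir() if p.is_dir())
```

Calling load_training_config() with no arguments inside a training container should return both dictionaries. Outside a container, pass a local folder as base_dir to try it out.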
I got an error related to the numexpr package. I fixed it with the command below:
!pip install numexpr==2.8.0 --upgrade
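To confirm that the pinned version is the one actually installed, a quick check like the following may help. This is only a sketch; installed_version is a hypothetical helper name, and it uses the standard library's importlib.metadata (Python 3.8+). If the notebook had already imported numexpr, the kernel may need a restart before the new version is picked up.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


print("numexpr:", installed_version("numexpr"))
```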
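I also had an error on the lines below (I think it is related to the SageMaker SDK version: csv_serializer was removed from sagemaker.predictor in version 2 of the SageMaker Python SDK):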
from sagemaker.predictor import csv_serializer
predictor = tree.deploy(1, "ml.m4.xlarge", serializer=csv_serializer)
I had to change it to:
from sagemaker.serializers import CSVSerializer
csv_serializer = CSVSerializer()
predictor = tree.deploy(1, "ml.m4.xlarge", serializer=csv_serializer)
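If the same notebook has to run under both old and new SDK versions, one option is to try the new import first and fall back to the old one. This is a sketch rather than an official recipe; get_csv_serializer is a hypothetical helper name, and it returns None when the sagemaker package is not installed at all.

```python
def get_csv_serializer():
    """Return a CSV serializer object for whichever SageMaker SDK is installed."""
    try:
        # Newer SDK: serializers live in sagemaker.serializers.
        from sagemaker.serializers import CSVSerializer
        return CSVSerializer()
    except ImportError:
        pass
    try:
        # Older SDK: a ready-made csv_serializer object in sagemaker.predictor.
        from sagemaker.predictor import csv_serializer
        return csv_serializer
    except ImportError:
        # The sagemaker package is missing or exposes neither name.
        return None
```

The result can then be passed to deploy in the same way as above, for example tree.deploy(1, "ml.m4.xlarge", serializer=get_csv_serializer()).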
I had another error when running the batch transform:
transformer.transform(
    data_location,
    content_type="text/csv",
    split_type="Line",
    input_filter="$[1:]",
)
transformer.wait()
ResourceLimitExceeded: An error occurred (ResourceLimitExceeded) when calling the CreateTransformJob operation: The account-level service limit 'ml.m4.xlarge for transform job usage' is 0 Instances, with current utilization of 0 Instances and a request delta of 1 Instances. Please use AWS Service Quotas to request an increase for this quota. If AWS Service Quotas is not available, contact AWS support to request an increase for this quota.
I had to request an increase for the 'ml.m5.xlarge for transform job usage' quota. See this link for more details.
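Before requesting an increase, it can be useful to look up the current value of the quota named in the error message. The sketch below uses the boto3 Service Quotas client for this. find_quota and lookup_sagemaker_quota are hypothetical helper names, and the code assumes that boto3 is installed and that the credentials in use are allowed to list service quotas.

```python
def find_quota(pages, quota_name):
    """Return the first quota dict whose QuotaName matches, or None."""
    for page in pages:
        for quota in page.get("Quotas", []):
            if quota.get("QuotaName") == quota_name:
                return quota
    return None


def lookup_sagemaker_quota(quota_name, region_name=None):
    """Search the SageMaker quotas of the current account for quota_name."""
    import boto3  # imported lazily so find_quota works without boto3

    client = boto3.client("service-quotas", region_name=region_name)
    paginator = client.get_paginator("list_service_quotas")
    return find_quota(paginator.paginate(ServiceCode="sagemaker"), quota_name)
```

For example, lookup_sagemaker_quota("ml.m4.xlarge for transform job usage") should return a dictionary whose "Value" field holds the current limit, or None if no quota with that exact name is found.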