
dsai-lake/
├── environments/
│   ├── production/
│   │   ├── experiments/
│   │   │   └── <experiment_name>/                # e.g., fraud_risk_Retail_ES
│   │   │       ├── metadata.yaml                 # Includes: scope, location, vertical, name, environment, aws_account, region, pii_handling details
│   │   │       └── runs/
│   │   │           └── <run_id>/
│   │   │               ├── metadata.yaml         # Contains: id, name, hyperparameters, input/output schemas, data versioning, metrics, training library, code_version, start_time, end_time, environment tag
│   │   │               ├── training/
│   │   │               │   ├── input/            # Training data source; include a log of PII transformations (masking, nulling)
│   │   │               │   ├── output/           # Artifacts, results, and automated data quality checks report
│   │   │               │   └── queries.yaml      # Captures queries executed, user details, and timestamps
│   │   │               └── logs/                 # Detailed logs including audit trails for PII and system events
│   │   ├── models/
│   │   │   └── <model_name>/                      # e.g., fraud_risk_Retail_ES
│   │   │       ├── metadata.yaml                  # Contains: scope, location, vertical, name, environment, aws_account, region, and reference to PII handling standards
│   │   │       └── versions/
│   │   │           └── <version_number>/
│   │   │               ├── metadata.yaml          # Inherits run and model metadata; adds version number and stage (None, Staging, Production, Archived)
│   │   │               └── artifact/              # The trained model artifact(s)
│   │   └── inference/
│   │       └── scores/
│   │           ├── <inference_date>/              # Organize by date or version for historical tracking
│   │           │   └── metadata.yaml              # Includes: inference metrics, timestamp, model version, environment tag, and validation status
│   │           └── logs/                          # Logs for inference process, including data quality and PII handling checks
│   └── staging(dev)/
│       └── [Structure mirrors production]        # Separate structure for staging to test before production roll-out
├── shared/
│   └── pii_config.yaml                           # Centralized configuration for PII: whitelisted columns, masking rules, and country-specific requirements
└── docs/
    ├── retention_policy.md                       # Documentation on retention and audit policies
    └── data_governance.md                        # Guidelines for data quality, environment management, and compliance (PII, audit trails)
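
The layout above maps naturally onto a few path-building helpers. The following is a minimal sketch in Python, not an established API for this lake: the function names (`run_dir`, `model_version_dir`, `scores_dir`, `check_stage`) and the `LAKE_ROOT` constant are hypothetical, and the environment is assumed to be passed as the literal directory name used in the tree (for example `production`). Only the directory segments and the stage values are taken from the layout itself.

```python
from pathlib import PurePosixPath

# Root directory name as shown at the top of the tree.
LAKE_ROOT = PurePosixPath("dsai-lake")

# Stage values listed in the comment on versions/<version_number>/metadata.yaml.
VALID_STAGES = ("None", "Staging", "Production", "Archived")


def _env_root(environment: str) -> PurePosixPath:
    """Return environments/<environment> under the lake root."""
    return LAKE_ROOT / "environments" / environment


def run_dir(environment: str, experiment_name: str, run_id: str) -> PurePosixPath:
    """Directory holding one run's metadata.yaml, training/ and logs/."""
    return _env_root(environment) / "experiments" / experiment_name / "runs" / run_id


def model_version_dir(environment: str, model_name: str, version_number: int) -> PurePosixPath:
    """Directory holding one model version's metadata.yaml and artifact/."""
    return _env_root(environment) / "models" / model_name / "versions" / str(version_number)


def scores_dir(environment: str, inference_date: str) -> PurePosixPath:
    """Directory holding the scores metadata for one inference date."""
    return _env_root(environment) / "inference" / "scores" / inference_date


def check_stage(stage: str) -> str:
    """Return the stage unchanged if it is one of the listed values, else raise."""
    if stage not in VALID_STAGES:
        raise ValueError(f"unknown stage {stage!r}; expected one of {VALID_STAGES}")
    return stage


if __name__ == "__main__":
    # Example identifiers; the run id and date format are illustrative only.
    print(run_dir("production", "fraud_risk_Retail_ES", "run_001"))
    print(model_version_dir("production", "fraud_risk_Retail_ES", 3))
    print(scores_dir("production", "2024-01-31"))
```

`PurePosixPath` is used so the helpers produce forward-slash paths on any operating system and never touch the filesystem, which keeps them usable for object-store key prefixes as well as local directories.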