Untitled

 avatar
unknown
plain_text
14 days ago
4.5 kB
2
Indexable
1)Develop a data warehouse application for sales management using ETL tool
 Tools:
    Pentaho Spoon (PDI)
    MySQL/PostgreSQL (for DWH)
    CSV files (input)
Tables:
Fact Table:
fact_sales(sid, cid, pid, did, qty, amt)
Dimension Tables:
    dim_customer(cid, name, region)
    dim_product(pid, name, price)
    dim_date(did, date, month, year)
    dim_store(sid, location)
Steps:
1. Extract
Use CSV Input step to read sales.csv, customers.csv, etc.
2. Transform
Clean data with Select Values, Filter Rows, etc.
3. Load
Use Table Output to insert into dimension and fact tables.
Use Dimension Lookup/Update to get surrogate keys.
Jobs:
    load_customers.kjb → reads & inserts dim_customer
    load_products.kjb → reads & inserts dim_product
    load_sales.kjb → loads fact_sales
Folder:
Sales_PDI/
├── jobs/
│   ├── load_customers.kjb
│   ├── load_sales.kjb
├── trans/
│   ├── t_clean_data.ktr
├── input/
│   ├── sales.csv
├── sql/
│   └── create_tables.sql





2)Develop a data warehouse application for University management using ETL tool
Tables:
Fact Table:
fact_result(sid, cid, fid, did, marks, grade)
Dimension Tables:
    dim_student(sid, name, dept, gender)
    dim_course(cid, name, credits)
    dim_faculty(fid, name, dept)
    dim_date(did, full_date, month, year)
Steps:
1. Extract
Use CSV Input to load student, course, faculty, and result data.
2. Transform
Clean data with Select Values, Replace, Filter Rows.
3. Load
Use Table Output to insert into dim_ and fact_ tables.
Use Dimension Lookup/Update for surrogate keys.
Jobs:
    load_students.kjb
    load_courses.kjb
    load_results.kjb
Folder:
Univ_ETL/
├── jobs/
│   ├── load_students.kjb
│   ├── load_results.kjb
├── trans/
│   ├── t_clean_data.ktr
├── input/
│   ├── students.csv
│   ├── results.csv
├── sql/
│   └── create_tables.sql






3)Demonstrate all OLAP operations on multi dimensional dataset for sales data analytics using OLAP server
Dataset: Sales Cube
Dimensions:
    Time: year, quarter, month
    Product: category, sub-category, product
    Region: country, state, city
 Measures:
    Sales, Quantity, Profit
OLAP Operations
1. Roll-up (Drill-up)
Go from a detailed level to a higher level.
Example: Month → Quarter → Year
SELECT {[Measures].[Sales]} ON COLUMNS,
       {[Time].[Year].Members} ON ROWS
FROM [SalesCube]
2. Drill-down
Go from summary to more detail.
Example: Year → Quarter → Month
SELECT {[Measures].[Sales]} ON COLUMNS,
       DESCENDANTS([Time].[Year].[2024], [Time].[Month]) ON ROWS
FROM [SalesCube]
3. Slice
Select one dimension value.
Example: Only show sales in "Electronics" category
SELECT {[Measures].[Sales]} ON COLUMNS,
       [Time].[Month].Members ON ROWS
FROM [SalesCube]
WHERE ([Product].[Category].[Electronics])
4. Dice
Filter on multiple dimensions.
Example: Sales in "Electronics" AND "Q1 2024" AND "New York"
SELECT {[Measures].[Sales], [Measures].[Profit]} ON COLUMNS,
       [Product].[Sub-category].Members ON ROWS
FROM [SalesCube]
WHERE (
    [Product].[Category].[Electronics],
    [Time].[Quarter].[Q1 2024],
    [Region].[City].[New York]
)
5. Pivot (Rotate)
Swap dimensions in rows and columns.
Example: Products on columns, Time on rows
SELECT [Product].[Category].Members ON COLUMNS,
       [Time].[Month].Members ON ROWS
FROM [SalesCube]






4)demonstarate all olap operations on single dimensional dataset loaded from externam source using olap server
Dataset: Monthly Sales (1 Dimension)
Source: sales.csv
Fields:
    Month, Sales, Profit
Loaded into OLAP cube with:
    Dimension: Time (Month)
    Measures: Sales, Profit
Roll-up (Month → Quarter)
SELECT [Time].[Quarter] ON ROWS, [Measures].[Sales] ON COLUMNS FROM [SalesCube]
2. Drill-down (Quarter → Month)
SELECT Descendants([Time].[Q1], [Time].[Month]) ON ROWS, [Measures].[Sales] ON COLUMNS FROM [SalesCube]
3.Slice
SELECT 
  {[Measures].[Sales], [Measures].[Profit]} ON COLUMNS,
  {} ON ROWS
FROM [SalesCube]
WHERE ([Time].[Month].[January])

4.Dice Jan-March
SELECT {[Time].[Month].[Jan], [Time].[Month].[Feb], [Time].[Month].[Mar]} ON ROWS, [Measures].[Profit] ON COLUMNS FROM [SalesCube]
5.Pivot(Swap measures/dimension)
SELECT {[Time].[Month]} ON COLUMNS, {[Measures].[Sales], [Measures].[Profit]} ON ROWS FROM [SalesCube]





Editor is loading...
Leave a Comment