![Data Warehouse](https://200zi.50zi.cn/uploadfile/img/16/2/66cba171b877ca0ccaf96c8ae81b94e1.jpg)
Knowledge Discovery Process
OLTP & OLAP
Online transaction processing (OLTP) systems cover most of an organization's day-to-day operations: purchasing, inventory, banking, manufacturing, payroll, registration, and accounting.
Online analytical processing (OLAP) systems organize and present data in different formats to meet the varied needs of different users, in support of data analysis and decision making.
Distinct features (OLTP vs. OLAP):
User and system orientation: customer vs. market
Data contents: current, detailed vs. historical, consolidated
View: current, local vs. evolutionary, integrated
Access patterns: update vs. read-only but complex queries
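The contrast in access patterns can be illustrated with a minimal sketch using Python's built-in `sqlite3` module. The `orders` table and its rows are hypothetical, invented only for this illustration; a real OLTP system and a real warehouse would normally be separate databases.

```python
import sqlite3

# Hypothetical table and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, customer TEXT, year INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, year, amount) VALUES (?, ?, ?)",
    [
        ("alice", 2022, 100.0),
        ("bob", 2022, 50.0),
        ("alice", 2023, 70.0),
        ("bob", 2023, 30.0),
    ],
)

# OLTP-style access: a short transaction that updates one current, detailed row.
with conn:
    conn.execute(
        "UPDATE orders SET amount = amount + 10 WHERE order_id = ?", (1,)
    )

# OLAP-style access: a read-only query that consolidates historical data.
yearly_totals = conn.execute(
    "SELECT year, SUM(amount) FROM orders GROUP BY year ORDER BY year"
).fetchall()
print(yearly_totals)
```

The update touches a single record, while the analytical query scans every row and returns only a summary per year.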
Data Warehouse
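DBMS: tuned for OLTP (access methods, indexing, concurrency control, recovery)
Warehouse: tuned for OLAP (complex OLAP queries, multidimensional view, consolidation)
Data Warehouse: a data warehouse integrates the business data scattered across separate information islands in an enterprise network and stores it in a single integrated relational database. This integrated information makes the data easier for users to access and lets decision makers analyze historical data over a period of time to study how things are trending.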
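Integrating data from separate sources into one store can be sketched as follows. This is only an illustration under stated assumptions: the two source lists (`store_sales`, `web_sales`), their field names, and the `sales` table are hypothetical, and a production pipeline would do far more validation.

```python
import sqlite3
from datetime import datetime

# Two hypothetical "information islands" with inconsistent formats.
store_sales = [{"date": "2023-01-05", "product": "Pen ", "amount": "12.50"}]
web_sales = [{"day": "05/01/2023", "item": "pen", "total_cents": 300}]


def from_store(row):
    """Clean a store record: trim and lowercase the name, parse the amount."""
    return (row["date"], row["product"].strip().lower(), float(row["amount"]), "store")


def from_web(row):
    """Transform a web record: DD/MM/YYYY to ISO date, cents to currency units."""
    day = datetime.strptime(row["day"], "%d/%m/%Y").date().isoformat()
    return (day, row["item"].strip().lower(), row["total_cents"] / 100, "web")


# Load both cleaned sources into a single integrated table.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE sales (sale_date TEXT, product TEXT, amount REAL, source TEXT)"
)
rows = [from_store(r) for r in store_sales] + [from_web(r) for r in web_sales]
warehouse.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)
warehouse.commit()

# Once integrated, one query covers both sources.
integrated = warehouse.execute(
    "SELECT sale_date, product, SUM(amount) FROM sales "
    "GROUP BY sale_date, product"
).fetchall()
print(integrated)
```

Normalizing dates, names, and units before loading is what lets the final query treat the two sources as one data set.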
“A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection of data in support of management’s decision-making process.” (W. H. Inmon)
Data stored in a data warehouse has been processed through extraction, cleaning, transformation, loading (sorting, summarizing, ...), and refresh.
Data Warehouse model:
Dimensions and measures: dimensions are used to locate data, and measures are the values you read at that location.
Conceptual models:
star schema,
snowflake schema (a refinement of the star schema),
fact constellations (a collection of stars)
Example of Star Schema:
Typical OLAP Operations:
Roll up: summarize data by climbing up a hierarchy or by dimension reduction; rolling a dimension up to "all" removes it
Drill down: the reverse of roll up, from a higher-level summary to a lower-level summary or detailed data
Slice and dice: project and select
Pivot (rotate): reorient the cube for visualization, for example turning a 3D cube into a series of 2D planes
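The operations above can be tried on a tiny star schema. The following is a minimal sketch with Python's built-in `sqlite3` module; the table names (`fact_sales`, `dim_time`, `dim_item`), columns, and rows are hypothetical and exist only for this illustration. The fact table holds the measure (`units_sold`) and the dimension tables hold the hierarchies (quarter within year, item within category).

```python
import sqlite3
from collections import defaultdict

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE dim_time (time_id INTEGER PRIMARY KEY, year INTEGER, quarter TEXT);
CREATE TABLE dim_item (item_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE fact_sales (
    time_id INTEGER REFERENCES dim_time(time_id),
    item_id INTEGER REFERENCES dim_item(item_id),
    units_sold INTEGER
);
INSERT INTO dim_time VALUES (1, 2022, 'Q1'), (2, 2022, 'Q2'), (3, 2023, 'Q1');
INSERT INTO dim_item VALUES (1, 'pen', 'stationery'), (2, 'phone', 'electronics');
INSERT INTO fact_sales VALUES (1, 1, 10), (1, 2, 2), (2, 1, 20), (3, 1, 5), (3, 2, 4);
""")

# Star join: the fact table joined to each of its dimension tables.
JOIN = (
    "FROM fact_sales f "
    "JOIN dim_time t ON f.time_id = t.time_id "
    "JOIN dim_item i ON f.item_id = i.item_id"
)


def query(sql):
    return db.execute(sql).fetchall()


# Drill down: view the measure at the finer quarter level.
drill_down = query(
    f"SELECT t.year, t.quarter, SUM(f.units_sold) {JOIN} "
    "GROUP BY t.year, t.quarter ORDER BY t.year, t.quarter"
)

# Roll up: climb the time hierarchy from quarter to year.
roll_up = query(
    f"SELECT t.year, SUM(f.units_sold) {JOIN} GROUP BY t.year ORDER BY t.year"
)

# Roll up time to "all": the time dimension disappears, only category remains.
by_category = query(
    f"SELECT i.category, SUM(f.units_sold) {JOIN} "
    "GROUP BY i.category ORDER BY i.category"
)

# Slice: select a single value on one dimension (year = 2022).
slice_2022 = query(
    f"SELECT t.quarter, i.category, SUM(f.units_sold) {JOIN} "
    "WHERE t.year = 2022 GROUP BY t.quarter, i.category "
    "ORDER BY t.quarter, i.category"
)

# Dice: select on two dimensions at once (quarter and category).
dice = query(
    f"SELECT t.year, SUM(f.units_sold) {JOIN} "
    "WHERE t.quarter = 'Q1' AND i.category = 'stationery' "
    "GROUP BY t.year ORDER BY t.year"
)

# Pivot: reorient (year, category) cells into a category-by-year table.
cells = query(
    f"SELECT t.year, i.category, SUM(f.units_sold) {JOIN} "
    "GROUP BY t.year, i.category ORDER BY t.year, i.category"
)
pivoted = defaultdict(dict)
for year, category, units in cells:
    pivoted[category][year] = units
pivoted = dict(pivoted)

print(roll_up, drill_down, by_category, slice_2022, dice, pivoted, sep="\n")
```

In this sketch, roll up and drill down differ only in the `GROUP BY` level, slice and dice differ only in the `WHERE` clause, and pivot changes nothing but which dimension is read as rows.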
References
University of Chinese Academy of Sciences, "Data Mining" course slides