Database Notebook 1
Contents
[TOC]
Purpose of Database Management System (DBMS) is to solve problems caused by file systems
- Data redundancy and inconsistency: duplicate information, file formats
- Difficulty in accessing data: new program to carry data
- Data isolation: files and formats
- Integrity: constraints are fixed, hard to add or change
- Atomicity of updates: failures may cause inconsistency.
- Concurrent access by multiple users
- Security: permissions
Schema: the logical structure of the database
Instance: the actual content of the database at a particular point in time
Physical Data Independence: the ability to modify the physical schema without changing the logical schema
Data Manipulation Language (DML) Data Definition Language (DDL)
Attributes
- Domain of the attribute is the set of allowed values for the attribute
- values should be atomic/indivisible
- $null \in every \ domain$
Relation schema
$R = (A_1, A_2, …, A_3)$ $instructor = (ID, name , depart_name, salary)$
relation is a subset of the join of domains $r \in D_1 * D_2*…*D_n$ is also a set of n-tuples $ a_1, a_2, …, a_n$ where $a_i \in D_i$ is unordered
Normalization theory
Keys $\in $ R Super key can identify a unique tuple of r® minimal super key is candidate key one of candidate key is primary key Foreign key: value in one relation must appear in another
Relational operators / Relational Algebra
Op | Symbol | Example |
---|---|---|
Selection | $\sigma_p® = {t | t \in r\ and\ p(t) }$ $\land\ and \ \lor\ or \ \neg \ not $ |
Projection | $\Pi_{A_1, A_2, …, A_k}® \ duplicates\ tuples\ are\ removed$ | $\Pi{A, C} ® \ \Pi{ID, name, salary}(instructor)$ |
Cartesian product | $r \times s = {t\ q | t \in r \land q \in s} $ assume attributes are disjoint; rename if joint |
Union | assume same attributes | $r \cup s $ |
Difference | r - s | |
rename | $\rhoX(E) \ \rho{X(A_1, A_2, …, A_n)}(E)$ | |
Derived ops | — | — |
Intersection | $r \cap s$ | |
Nature join | associative; commutative | $r \Join s$ |
Theta Join | $r \Join\theta s = \sigma\theta(r \times s)$ | |
Assignment | $\leftarrow $ | |
Left outer join | r ⟕ s = $(r \Join s) \cup (r - \Pi_R(r \Join s) \times {(null,…, null)}) $ | |
Right outer join | r ⟖ s | |
Full outer join | r ⟗ s | |
Division | $r \div s = largest\ t(R-S)\ st.\ t\times s \in r \ temp1 \leftarrow \Pi{R-S}® \ temp2 \leftarrow \Pi{R-S}((temp1 \times s) - \Pi_{R-S, S}®) \ result = temp1 - temp2$ | |
Extended ops | ||
Generalized projection | $\Pi_{F_1, F_2, … ,F_n}(E)$ | $\Pi_{ID,salary/12}(instructor)$ |
Aggregation func | ||
avg, min, max, sum, count | ${G1,G2,…Gn}G{F_1(A_1),…,F_n(A_n)}(E)$ | |
rename | ${dept_name}G{\boldsymbol{avg}(salary)\ \boldsymbol{as}\ avg_sal}(instructor)$ |
Modification
deletion insertion updating
SQL & Relational Algebra
SQL | RA |
---|---|
$select\ A_1, A_2, …, A_n \ from\ r_1, r_2, … ,r_m \ where\ P$ | $\Pi_{A_1,…,A_n}(\sigma_P(r_1 \times … \times r_m)) $ |
$select\ A_1, A_2, sum(A_3) \ from\ r_1, r_2, … ,r_m \ where\ P \ group\ by\ A_1, A_2 $ | $_{A_1, A2}G{sum(A_3)}(\sigma_P(r_1 \times … \times r_m)) $ |
Tuple relational calculus
safety
Domain relational calculus
${
SQL
Author Chen Tong
LastMod 0001-01-01