联系方式

  • QQ:99515681
  • 邮箱:99515681@qq.com
  • 工作时间:8:00-23:00
  • 微信:codinghelp

您当前位置:首页 >> Java编程Java编程

日期:2020-07-09 10:53

CMPT 454 Assignment 3: Query Evaluation

This assignment is worth approximately 7% of your final grade.

Question 1

For each of the operations described below you are to calculate the number of disk reads (and

writes) to perform the operation in the most efficient way. Do not include the cost to write out

the result of the operation. Briefly explain the process of each operation and its components.

For example, use the index on a, read x nodes of the index and y records of the file.

Assume the root node of tree indexes and directory of extensible hash indexes are not held in

main memory at the start of the operation. [2 marks each, unless noted otherwise]

Product = {pid, pname, ptype, pnumber, description, manufacturer, mcountry, mcity, price}

Table Statistics (note for all V the table is the Product table)

B(Product) T(Product) V(pid) V(pname) V(ptype) V(pnumber)

25,000 250,000 250,000 50,000 500 1,000

V(mcountry) V(mcity) V(manufacturer) V(price) V(description)

40 1,000 10,000 50,000 125,000

V({ptype, pnumber}) V({mcountry, mcity})

250,000 2,500

Indexes

? Primary dense B+ tree index on {ptype, pnumber} where ptype is the prefix of the search key.

The index is height 4 (includes leaf and root level) and interior and leaf nodes contain 50

search keys on average.

? B+ tree index on {mcountry, mcity} where mcountry is the prefix of the search key. The index

is height 5 (includes leaf and root level) and interior and leaf nodes contain 30 search keys on

average.

? B+ tree index on pname. The index is height 4 (includes leaf and root level) and interior and

leaf nodes contain 60 search keys on average.

? Extensible hash index on pid. The directory resides on two disk blocks and each bucket

contains 100 search keys on average.

? Linear hash index on manufacturer. Each bucket contains 40 search keys on average – there

are no overflow blocks.

a) ?(ptype = 'ABC' ? pnumber = 13) (Product)

b) ?(ptype = 'ABC') (Product)

c) ?(pid = 123456 ? pid = 678324) (Product)

d) ?(mcountry = 'UK' ? mcity = 'Sheffield') (Product)

e) ?(mcity = 'Detroit') (Product)

f) ?(pname = 'foo345' ? manufacturer = 'acme') (Product)

g) ?(mcountry = 'Canada' ? price > 25.00 ? description = 'sweet widget) (Product)

h) ?(pname = 'bar111' ? price = 73.80 ? ptype = 'TLR') (Product)

i) ?( (pname = 'foo17' ? price = 19.99) ? (ptype = 'HJK' ? price = 19.99) ? (ptype = 'HJK' ? pid = 432911) ) (Product)

j) Sort the Product table assuming there are 100 main memory frames available

k) ?(ptype, description) (Product) – assume there are 50 main memory frames available, and that

duplicates are not to be removed

l) ?(ptype, description) (Product) – assume there are 50 main memory frames available, and that

duplicates are to be removed using sort projection; also assume that duplicates are only

encountered in the final stage of the process [4 marks]

Note that ptype is 4 bytes, description is 96 bytes and pages are 4,096 bytes

Question 2

Answer the following questions about performing a

natural join between the Patient and Visit tables. The

only attribute the two tables have in common is msp,

which is the primary key of Patient and a foreign key in

Visit. Relevant information is shown to the right. [15]

a) How many records will the joined relation contain? [1]

b) Approximately how many records of the joined relation fit on a single block? [1]

c) What is the main memory requirement (in frames) to perform the join in two passes using

the sort-join algorithm? Briefly explain your calculation. [2]

d) If Patient was already sorted on msp would your answer to part (b) change? Explain why or

why not? [2]

e) What is the main memory requirement (in frames) to perform the join in two passes using

the hash-join algorithm? Briefly explain your calculation. [2]

f) Assume that there are approximately 1,200 frames available for performing the join; what is

the cost of performing the join using the block nested loop join algorithm? Briefly explain your

calculation. [3]

g) Assume that there is a secondary extensible hash index on msp in Patient. If the directory

page of the index is retained in main memory, what is the cost of performing the join using

the index nested loop join algorithm? Briefly explain your calculation. [3]

h) Assume that there is a primary B+ tree index of height 3 on msp in Visit. If the root node of

the index is retained in main memory, what is the cost of performing the join using the index

nested loop join algorithm? Briefly explain your calculation. [3]

i) Assume that there are just over 4,000 frames available for performing the join; what is the

cost of performing the join using the hybrid hash join algorithm where one partition of the

outer relation is to be retained in main memory? Briefly explain your calculation. [3]

Visit Patient

T(R) 1,800,000 180,000

B(R) 180,000 18,000

V(R, msp) 180,000 180,000


版权所有:留学生编程辅导网 2020 All Rights Reserved 联系方式:QQ:99515681 微信:codinghelp 电子信箱:99515681@qq.com
免责声明:本站部分内容从网络整理而来,只供参考!如有版权问题可联系本站删除。 站长地图

python代写
微信客服:codinghelp