Docsity
Docsity

Prepare for your exams
Prepare for your exams

Study with the several resources on Docsity


Earn points to download
Earn points to download

Earn points by helping other students or get them with a premium plan


Guidelines and tips
Guidelines and tips

Digital Data Must be - Data Structures - Lecture Slides | CMSC 420, Study notes of Data Structures and Algorithms

Material Type: Notes; Class: Data Structures; Subject: Computer Science; University: University of Maryland; Term: Unknown 1989;

Typology: Study notes

Pre 2010

Uploaded on 02/13/2009

koofers-user-cdn
koofers-user-cdn 🇺🇸

10 documents

1 / 32

Related documents


Partial preview of the text

Download Digital Data Must be - Data Structures - Lecture Slides | CMSC 420 and more Study notes Data Structures and Algorithms in PDF only on Docsity! CMSC 420 Data Structures Lecture 1: Introduction Digital Data gatcttttta tttaaacgat ctctttatta gatctcttat taggatcatg atcctctgtg gataagtgat tattcacatg gcagatcata taattaagga ggatcgtttg ttgtgagtga ccggtgatcg tattgcgtat aagctgggat ctaaatggca tgttatgcac agtcactcgg cagaatcaag gttgttatgt ggatatctac tggttttacc ctgcttttaa gcatagttat acacattcgt tcgcgcgatc tttgagctaa ttagagtaaa ttaatccaat ctttgaccca Movies Music Photos Maps Protein Shapes DNA 0010101001010101010100100100101010000010010010100.... Data Structures -> Data StructurING How do we organize information so that we can find, update, add, and delete portions of it efficiently? Niklaus Wirth, designer of Pascal Algorithms + Data Structures = Programs Data Structure Example Applications 1. How does Google quickly find web pages that contain a search term? 2. What’s the fastest way to broadcast a message to a network of computers? 3. How can a subsequence of DNA be quickly found within the genome? 4. How does your operating system track which memory (disk or RAM) is free? 5. In the game Half-Life, how can the computer determine which parts of the scene are visible? Dictionary ADT • Most basic and most useful ADT: • insert(key, value) • delete(key, value) • value = find(key) • Many languages have it built in: • Insert, delete, find each either O(log n) [C++] or expected constant [perl, python] • Any guesses how dictionaries are implemented? awk: D[“AAPL”] = 130 # associative array perl: my %D; $D[“AAPL”] = 130; # hash python: D = {}; D[“AAPL”] = 130 # dictionary C++: map<string,string> D = new map<string, string>(); D[“AAPL”] = 130; // map C++ STL • Data structures = “containers” • Interface specifies both operations & time guarantees Container Element Access Insert / Delete Iterator Patterns vector const O(n) Random list O(n) const Bidirectional stack const (limited) O(n) Front queue const (limited) O(n) Front, Back deque const O(n), const @ ends Random map O(log n) O(log n) Bidirectional set O(log n) O(log n) Bidirectional string const O(n) Bidirectional array const O(n) Random valarray const O(n) Random bitset const O(n) Random Some STL Operations • Select operations to be orthogonal: they don’t significantly duplicate each other’s functionality. • Choose operations to be useful building blocks. • push_back • find • insert • erase • size • begin, end (iterators) • operator[] • front • back • for_each • find_if • count • copy • reverse • sort • set_union • min • max E.g. Data Structure Operations E.g. Algorithms Ite ra to rs , S eq ue nc es Data Organizing Principles • Ordering: • Put keys into some order so that we know something about where each key is are relative to the other keys. • Phone books are easier to search because they are alphabetized. • Linking: • Add pointers to each record so that we can find related records quickly. • E.g. The index in the back of book provides links from words to the pages on which they appear. • Partitioning: • Divide the records into 2 or more groups, each group sharing a particular property. • E.g. Multi-volume encyclopedias (Aa-Be, W-Z) • E.g. Folders on your hard drive Ordering Pheasant, 10 Grouse, 89 Quail, 55 Pelican, 3 Partridge, 32 Duck, 18 Woodpecker, 50 Robin, 89 Cardinal, 102 Eagle, 43 Chicken, 7 Pigeon, 201 Swan, 57 Loon, 213 Turkey, 99 Albatross, 0 Ptarmigan, 22 Finch, 38 Bluejay, 24 Heron, 70 Egret, 88 Goose, 67 Albatross, 0 Bluejay, 24 Cardinal, 102 Chicken, 7 Duck, 18 Eagle, 43 Egret, 88 Finch, 38 Goose, 67 Grouse, 89 Heron, 70 Loon, 213 Partridge, 32 Pelican, 3 Pheasant, 10 Pigeon, 201 Ptarmigan, 22 Quail, 55 Robin, 89 Swan, 57 Turkey, 99 Woodpecker, 50 Sequential Search – O (n) (1) (2) (3) (4) Search for “Goose” Every step discards half the remaining entries: n/2k = 1 2k = n k = log n Binary Search O(log n) Big-O notation • O() notation focuses on the largest term and ignores constants - Largest term will dominate eventually for large enough n. - Constants depend on “irrelevant” things like machine speed, architecture, etc. • Definition: T(n) is O(f(n)) if the limit of T(n) / f(n), is a constant as n goes to infinity. • Example: - Suppose T(n) = 12n2 + n + 2 log n. - Consider f(n) = n2 - Then lim [T(n) / f(n)] = lim [12 + (1/n) + (2 log n) / n2] = 12 • Example: - Is T(n) = n log n in O(n)? - Check: lim [(n log n) / n ] = lim log n = infinity! So no! 0 25 50 75 100 125 150 175 200 225 2.5!10 4 5!10 4 7.5!10 4 1!10 5 1.25!10 5 Big-O Examples y = 3000 y = 50n y = 2n2 y = 40000(log n) y = 6723(√n) y = (0.00000001)2n Linking • Records located any where in memory • Green pointers give “next” element • Red pointers give “previous” element • Insertion & deletion easy if you have a pointer to the middle of the list • Don’t have to know size of data at start • Pointers let us express relationships between pieces of information. 2 24 43 78 97 Partitioning • Ordering implicitly gives a partitioning based on the “<“ relation. • Partitioning usually combined with linking to point to the two halves. • Prototypical example is the Binary Search Tree: 9 18 35 98 31 19 58 16 Find 18 All keys in the left subtree are < the root All keys in the right subtree are ≥ the root Implementing Data Structures in C++ • Remember: for templates to work, you should put all the code into the .h file. • Templates aren’t likely to be required for the coding project, but they’re a good mechanism for creating reusable data structures. template<class K> struct Node { Node<K> * next; K key; }; template<class K> class List { public: List(); K front() { if(root) return root->key; throw EmptyException; } // ... protected: Node<K> *root; }; Structure holds the user data and some data structure bookkeeping info. Main data structure class implements the ADT. Will work for any type K that supports the required operations (like <). Any Data Type Can Be Compared: • By overloading the < operator, we can define an order on any type (e.g. MyType) • We can sort a vector of MyTypes via: • Thus, we can assume we can compare any types. struct MyType { string name; // ... }; bool operator<(const MyType & A, const MyType & B) { return A.name < B.name; } vector<MyType> vec; // ...fill in vec with many MyType records... sort(vec.begin(), vec.end()); So, • Much of programming (and thinking about programming) involves deciding how to arrange information in memory. [Aka data structures.] • Choice of data structures can make a big speed difference. - Sequential search vs. Binary Search means O(n) vs. O(log n). - [log (1 billion) < 21]. • Abstract Data Types are a way to encapsulate and hide the implementation of a data structure, while presenting a clean interface to other programmers. • Data structuring principles: - Ordering - Linking - (Balanced) partitioning • Review Big-O notation, if you’re fuzzy on it. Resources • TA Office Hours: • Radu Dondera, rdondera@cs, Mondays 12:30-2:30pm. • Subhojit Basu, subhojit.basu@gmail.com, Wednesdays 2-4pm, AVW 1105. • My Office Hours: Tuesdays 3:30-5pm, AVW 3223. • Dave Mount’s Lecture Notes: • http://www.cs.umd.edu/~mount/420/Lects/420lects.pdf • Books: • Practical Introduction to Data Structures and Algorithm Analysis, C++ Edition, 2nd Edition by Clifford A. Shaffer. Prentice Hall, 2000. ISBN: 0-13-028-446-7. • Foundations of Multidimensional and Metric Data Structures by Hanan Samet. Morgan Kaufmann, 2006. ISBN: 0-12-369-446-9 (recommended). Assignments & Grading • Homeworks: • 5 homeworks, short answers, pseudo-code or English descriptions. • Goal: if you really understand the material, each homework shouldn’t take more than 2-3 hours. • PLEASE be neat with the write ups --- typed solutions make graders happy. In an unrelated note, happy graders probably tend to give higher grades. • Programming Project: • Multiple parts (2 or 3). • Must be done in C/C++. • Homeworks + Project: ~ 50% of final grade. • Exams: • Midterm (March 13, in class): ~ 20% of final grade • Final, comprehensive (according to university schedule): ~ 30% of final grade Assignments & Grading • Academic Honesty: All classwork should be done independently, unless explicitly stated otherwise on the assignment handout. • You may discuss general solution strategies, but must write up the solutions yourself. • If you discuss any problem with anyone else, you must write their name at the top of your assignment, labeling them “collaborators”. • NO LATE HOMEWORKS ACCEPTED • Turn in what you have at the time it’s due. • All homeworks are due at the start of class. • If you will be away, turn in the homework early. • Late Programming Assignments (projects) will be accepted, but penalized according to the percentages given on the syllabus.
Docsity logo



Copyright © 2024 Ladybird Srl - Via Leonardo da Vinci 16, 10126, Torino, Italy - VAT 10816460017 - All rights reserved