# Data structures and algorithms

What are data structures and algorithms? Data structures refer to the way how data are organized and manipulated. The algorithms are pre-defined sequences that can solve the problem efficiently. The Data structures and algorithms overview let you browse the most important subjects in Computer Science. You can understand quickly with the illustrations.This is also the second episode of Landscape series in youtube. 1. Data structures

With data structure, we find ways to make data access more efficient. When dealing with data structure, we not only focus on one piece of data, but rather different sets of data and how they can relate to one another in an organized manner.

Array: An array is an index-based data structure, which means every element is referred by an index. An array holds same data type elements. Linked list: A linked list is a sequence of nodes in which each node is connected to the node following it. This forms a chain-link of data storage. It consists of data elements and a reference to the next record. Tree: A tree is a collection of nodes connected by edges. Each node points to a number of nodes. A tree represents the hierarchical graphic form. Binary tree: A binary tree has 1 or 2 nodes. It can have a minimum of zero nodes, which occurs when the nodes have NULL values. Binary search tree: A binary search tree (BST) is a binary tree. The left subtree contains nodes whose keys are less than the node's key value, while the right subtree contains nodes whose keys are greater than or equal to the node's key value. Moreover, both subtrees are also binary search trees. A binary search tree can retrieve data efficiently. Matrix: A matrix is a double-dimensioned array. It makes use of two indexes rows and columns to store data. Graph: A graph contains a set of nodes and edges. The nodes are also called vertices. Edges are used to connect nodes. Nodes are used to store and retrieve data. Stack: a stack is LIFO data structure in which only the top element can be accessed. The data is added by push and removed by pop on top. Queue: A queue is FIFO data structure. In this structure, new elements are inserted at one end and existing elements are removed from the other end. Max-Heap: A heap is a tree-based data structure in which all the nodes of the tree are in a specific order. Max-heap is a binary tree. It is complete. The data item stored in each node is greater than or equal to the data items stored in its children. Min-Heap: Min-heap is a binary tree. It is complete. The data stored in each node is less than the data items stored in its children. Trie: A Trie is a tree. In a trie, every node (except the root node) stores one character or a digit. By traversing the trie down from the root node to a particular node n, a common prefix of characters or digits can be formed which is shared by other branches of the trie as well. Suffix trie: Suffix trie is a trie containing all the suffixes of the given text. Suffix trie allows particularly fast implementations of many important string operations. 2. Java Collections

Java collection framework is the set of collection types that are included as part of core java. It provides the APIs or methods that you can use directly to operate on the data structures such as Arrays, Linked Lists, stacks, queues, sets and maps. If you master java collection, it will save you tons of time and helps to solve complex problems.

ArrayList:ArrayList class is a resizable-array implementation of the List interface. It implements all optional list operations and permits all elements. Vector:Vector is very similar to ArrayList but Vector is synchronised and slow. It is a legacy class and now it is comatible with collections.
String: String class is used to create and manipulate strings. LinkedList: LinkedList class is a doubly-linked list implementation of the List and Deque interface. A LinkedList stores its data as a list of elements and every element is linked to its previous and next element. HashMap: A HashMap is a collection class implements Map interface. It requires a hash function and uses hashCode() and equals() methods, in order to put and retrieve elements to and from the collection respectively. Hashtable: Hashtable class is similar to HashMap. It implements Dictionary. Hashtable provides an Enumeration of its keys. It doesn't allow null as key or value. Please note that since HashMap was created later, it is an advanced version and improvement on the Hashtable. Hashtable is synchronized and slower. HashMap is preferred over Hashtable.
TreeMap:TreeMap implements SortedMap interface. It is sorted in the ascending order of its keys. The complexity of the operations is O(logn). LinkedHashMap: LinkedHashMap keeps the inserting order. The complexity is the same as HashMap O(1). HashSet: HashSet class implements Set interface. Duplicate values are not allowed. Its elements are not ordered. NULL elements are allowed in HashSet. TreeSet: TreeSet is implemented using a tree structure. The elements in a TreeSet are sorted. The complexity of operations is O(logn). LinkedHashSet: LinkedHashSet maintains the insertion order. Elements get sorted in the same sequence in which they have been added to the Set. The complexity is the same as HashSet O(1). Stack: Stack class extends Vector class with five operations to support LIFO (Last In First Out). Stack internally has a pointer: TOP, which refers to the top of the Stack element. PriorityQueue:PriorityQueue class is the implementation of Queue, with each element having a priority associated with it. The elements of the priority queue are ordered according to their natural ordering, or by a Comparator provided at queue construction time. 3. Algorithms

An algorithm is a well-defined procedure that allows a computer to solve a problem. There are many algorithms. Here I list a few widely used algorithms in computer science: sorting, searching, recurive programming and dynamic programming.

Sorting: Sorting is an algorithm made up of a series of instructions that takes an array as input, performs specified operations on the array, sometimes called a list, and outputs a sorted array. The simple sorting algorithms are bubble sort, selection sort and insertion sort.
Bubble sort: It is the simplest sorting algorithm. We start at the beginning of the array and swap the first two elements if the first is greater than the second. Then we go to the next pair, and so on, continuously making sweeps of the array until it is sorted. O(n 2) average and worst. Selection sort: It is the most intuitive one, not necessarily efficient. Find the smallest element using a linear scan and move it to the front (swapping it with the front elements). Then find the second smallest and move it, again doing a linear scan. Continue doing this until all the elements are in place. Good for small files. O(n 2) average and worst. Insertion sort: It sorts the array by shifting elements one by one. Each iteration removes an element from the input data and inserts it into the correct position in the list being sorted. It is efficient for smaller data sets, but very inefficient for larger lists. It is better than Selection Sort and Bubble Sort algorithms. O(n 2) average and worst. Search: Search is to find the content based on the key. There are linear search and binary search.
Linear search: Linear search is a method for finding a target value within a list. It sequentially checks each element of the list for the target value until a match is found or until all the elements have been searched. Binary Search: Binary search is an efficient algorithm for finding an item from an ordered list of items. It works by repeatedly dividing in half the portion of the list that could contain the item; until you narrow down the possible locations to just one. The complexity reduces from O(n) to O(logn). Recursion: Recursion is a computer programming technique that a function or an algorithm calls itself. It should include a step with a termination condition. When the condition is met, the rest of each repetition is processed from the last one called to the first. The most famous problem solved by recursion is Factorial number.
Factorial number: Factorial of a number n is product of all positive non-zero numbers less than or equal to n. The factorial of n is denoted by n!. Dynamic programming: Dynamic programming is a method for solving a complex problem by breaking it down into a collection of simpler subproblems, solving each of those subproblems just once, and storing their solutions. The next time the same subproblem occurs, one looks up the previously computed solution, thereby saving computation time at the expense of a modest expenditure in storage space. The famous dynamic programming problem is Fibonacci numbers.
Fibonacci numbers: They are a series of numbers in which each number ( Fibonacci number ) is the sum of the two preceding numbers. The simplest is the series 1, 1, 2, 3, 5, 8, etc. Divide and conquer: A divide and conquer algorithm works by recursively breaking down a problem into two or more sub-problems of the same or related type, until these become simple enough to be solved directly. The famous problems using divide and conquer are merge sort and quick sort.
Merge sort: Divides the array in half, sorts each of those halves, and then merges them together. Each of those halves has the same sorting algorithms applied to it. Eventually, it merges two single-element arrays. O (nlogn) average and worst. Quick sort: Picks a random element and partitions the array, all numbers that are less than the partitioning element come before all elements that are greater than it. If we repeatedly partition the array around the element, the array will eventually become sorted. But as the partitioned element is not guaranteed to be the median, our sorting could be very slow. O(nlogn) average, O(n 2) worst. Greedy: Greedy algorithm makes the choice that seems to be the best at that moment, ie makes a locally-optimal choice in the hope that this choice will lead to a globally-optimal solution. The famous problem solved by greedy algorithms is Huffman coding.
Huffman coding: Huffman coding is a lossless data compression algorithm. The idea is to assign variable-length codes to input characters, lengths of the assigned codes are based on the frequencies of corresponding characters. 