WWW
World Wide Web is an information system that supports specially formatted documents and it provides the path to connect to content anywhere in the world.
HTML
Hypertext Markup Language is the “hidden” code that helps us communicate with others on the World Wide Web.
Browser
A browser is software (Internet Explorer, Firefox, etc.) used to access and display pages on the World Wide Web.
Servers
Servers are computers that provide data to other computers.
HTML Markups and Tags
Tags
- <tag> Opening Tag
- </tag> Closing Tag
- <b> Bold
- <em> Emphasis; Italics
- <br> Line Break
- <p> Paragraph
- <a> Anchor Tag
Void Tags
Void Tags don’t require a closing tag. Examples include <br> and <img>.
Elements
An HTML Element is everything from the opening tag to the closing tag.
Attributes
HTML links are defined with the <a> tag. The link address is specified in the "href" attribute.
Inline and Block Elements
The div element is a block-level element. A block-level element starts on a new line and creates an “invisible box” around the content enclosed within its opening and closing tags (examples: <p>, <div>).
The span element is an inline element. An inline element lets you mark off content from the elements around it within a document without causing a line break (examples: <b>, <em>, <img>, <span>).
HTML, CSS and JavaScript are all languages.
HTML is responsible for the structure of the webpage.
CSS controls how the webpage looks (style) and JavaScript handles the interactive components of the webpage.
HTML vs CSS
HTML is the language for building webpages. This language has a specific syntax and specific rules. The basic unit in HTML is the "tag", and the browser turns these tags into elements that form a tree.
CSS (Cascading Style Sheets) uses its own syntax and rules to change how the elements look on the page.
DOM
DOM (Document Object Model) is how the browser represents an HTML file. The browser translates the elements in the HTML text document into nodes in a tree-like structure.
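The tree idea can be sketched in Python with the standard library's html.parser module. This is only an illustration, not how a browser builds the real DOM: the class name OutlineBuilder and the sample markup are made up for this example, and void tags such as <br> are not handled.

```python
from html.parser import HTMLParser


class OutlineBuilder(HTMLParser):
    """Records each element and text node together with its depth in the tree."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.outline = []

    def handle_starttag(self, tag, attrs):
        # An opening tag starts a new element one level deeper in the tree.
        self.outline.append("  " * self.depth + tag)
        self.depth += 1

    def handle_endtag(self, tag):
        # A closing tag ends the current element, so we move back up one level.
        self.depth -= 1

    def handle_data(self, data):
        # Text between tags becomes a child of the enclosing element.
        if data.strip():
            self.outline.append("  " * self.depth + repr(data.strip()))


builder = OutlineBuilder()
builder.feed("<div><p>Hello <b>world</b></p></div>")
print("\n".join(builder.outline))
```

The indentation in the printed outline mirrors the nesting: b sits inside p, which sits inside div.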
Boxes
Webpages can be seen as boxes inside other boxes. Boxes consist of title, content and style. The content can consist of images, headers, text as well as other elements.
Text Editors
- Sublime Text 2 is a text editing program. It is a little difficult at the beginning but supports all the languages.
- Scratchpad is beginner friendly, gives immediate feedback and is designed for smaller projects.
- Codepen allows you to very easily share your work using three languages (HTML, CSS and JavaScript).
Avoiding Repetition
When using CSS, one of the most common problems that web designers come across is repetition. Repeating the same rules in many places makes errors more likely, because every copy has to be updated whenever something changes.
CSS
CSS (Cascading Style Sheets) is code written to control the “style” of HTML elements. An HTML file can reference several CSS files, or sheets. Cascading means that when several rules match an element, the most specific rule is applied.
Inheritance is a key feature in CSS and it relies on the ancestor-descendant relationship to operate.
Box Sizing and Positioning
All HTML elements can be considered as boxes. In CSS, the term "box model" is used when talking about design and layout.
The box consists of 4 elements:
- Content - is the nucleus; it is the image or text that appears on the website.
- Padding - is the space inside the box that surrounds the content and separates it from the border.
- Border - goes around the padding and the content. By default, the border takes its color from the color property of the box.
- Margin - is the outer layer. The margin has no background color and is completely transparent.
Considering that there are so many elements to each box and so many boxes in an HTML document, it could be difficult sometimes to size and position the box just right.
There are two ways to deal with Box Sizing. One way is to use percentages as opposed to pixels to set the size of the box, and the second way is to set the box-sizing property to border-box, so that the padding and border are included in the box's width and height.
When it comes to Box Positioning, a div is a block-level element and it takes up the whole available width. You can think of it as a whole block that prevents other elements from sharing the same line. Adding the rule "display: flex;" to the CSS of the parent element overrides this behavior and lets divs appear next to each other.
Computer Science
Computer Science is about how to solve problems, like building a search engine, by breaking them into smaller pieces and precisely and mechanically describing a sequence of steps that you can use to solve each piece. Those steps can be executed by the computer. A lot of machines (like a toaster) are designed to do only a few things. These machines will be able to do only these few things, unless you alter them physically. Computers are different. We can program computers to do anything we want, as long as we are able to write a program that specifies the sequence of instructions.
Computer Program
A computer Program is a list of instructions that tells the computer what to do. The Program has to be a very precise sequence of steps. The power of the computer helps to execute these steps very fast. Programs are the core of computer programming and the computer is basically useless without them.
Python
Python is a high-level programming language that we can use to write programs. Python programs are run by an interpreter: a program that reads our code, interprets it and executes it. Python emphasizes code readability.
Computer Language
A Computer Language is a language that is designed to communicate instructions to a computer. “Normal” language cannot be used in programming, because it is ambiguous and verbose. A programming language has a grammar just like a human language. A grammar specifies what is “correct” and what is “incorrect”. The main difference between these two types of grammar is that a programming language's grammar does not forgive any mistakes, whereas you can still easily understand spoken language that contains a few mistakes.
One way to describe the grammar of a computer language is the Backus-Naur Form (BNF). John Backus was the lead designer of the Fortran programming language (1950s). The purpose of BNF is to describe a language in a way that is both very simple and very precise.
Python Expressions
In Python, an “expression” is a piece of code that the computer evaluates to produce a value. In Python the code must match the language grammar exactly. It will fail if it is not written correctly.
For example:
print 2 * 2 + 6 is valid, because 2 * 2 + 6 is a valid expression
print 2 * 2 + is not valid, because 2 * 2 + is not a complete expression
Difference between 1 and 1.0 in Python
If you are getting 0 when running your code, try changing the values to decimals.
When you divide an integer by another integer, Python 2 (the version used in these notes) does something called integer division, where it drops the decimal part of the answer.
For example, 3/2 will give you 1 instead of 1.5.
In order to force Python to give you an answer with the decimal part, you need to make one of the numbers into a decimal by putting a decimal point after it.
For example, 3/2.0 will give you 1.5.
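A small sketch of the two kinds of division follows. It is written with the print() function of Python 3, where 3/2 already gives 1.5 and the // operator performs the integer division described above; the two expressions used here give the same results in both Python 2 and Python 3.

```python
# Integer (floor) division drops the decimal part of the answer.
whole = 3 // 2

# Making one of the numbers a decimal gives an answer with the decimal part.
exact = 3 / 2.0

print(whole)  # 1
print(exact)  # 1.5
```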
As the name implies, a variable is something that can change. Variable names in Python are case sensitive. Variables give programmers a way to assign names to values.
It is very useful to assign a value to a variable. This helps a lot when dealing with large amounts of confusing and meaningless numbers. We can assign the value 5 to the variable my_age by writing this code: my_age = 5. The value of the variable can be reassigned or changed into a different value at any point.
Variables can be useful to programmers in many ways because:
- they help to improve code readability,
- they give us a way to store the value of important data,
- they give the programmer a way to change the value of a variable.
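As a brief illustration of these points (written for Python 3; the name My_Age is made up here only to show case sensitivity):

```python
my_age = 5
print(my_age)          # 5

# Reassignment: the name my_age now refers to a new value.
my_age = my_age + 1
print(my_age)          # 6

# Names are case sensitive, so My_Age is a different variable from my_age.
My_Age = 40
print(my_age, My_Age)  # 6 40
```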
Strings
Strings are sequences of characters enclosed in quotes. A string can start and end with either single quotes or double quotes, as long as the opening and closing quotes match.
The Different Meaning of =
= means assignment rather than equality. You should think of it as an arrow: whatever value the right side evaluates to is put into the name on the left side.
What is Function?
Functions take input, do something to it and then they produce output.
Difference between making and using functions
A function definition starts with the keyword def, followed by the function name and the function parameters in parentheses. When the function is used, these parameters are replaced by actual values.
The following code could be the definition of a function called "square".
For example:
def square(n):
    return n * n

print square(5)
>>>25
How Functions Help to Avoid Repetition
When programmers create a function once, they never have to define it again and they can reuse it forever.
What if a Function doesn’t have a return statement?
The return statement tells Python what value the function should produce as its output. If a function doesn't have a return statement, the function still runs, but it produces the special value None instead of a useful result.
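The difference can be seen in a short sketch (Python 3 syntax; the function square_no_return is a made-up variant for comparison):

```python
def square(n):
    return n * n


def square_no_return(n):
    n * n  # the product is computed and then thrown away


print(square(5))            # 25
print(square_no_return(5))  # None
```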
Making Decisions
Python provides different operators that we can use in comparisons, such as <, >, != (not equal to) and many more. These operators work on numbers, and any numbers can be compared. We use == to test for equality, while = means assignment. The output is a Boolean value that is either True or False.
For example:
print 1<2
>>>True
print 2<1
>>>False
If and Else Statements
If statements are used in order to control which code will get executed when conditions are met and while loops are used to perform repetitive tasks.
For example:
def bigger(a,b):
    if a>b:
        return a
    else:
        return b
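Here is the same function in use, as a Python 3 sketch. The helper biggest is hypothetical; it is added only to show how a function built on an if statement can be reused instead of repeating the comparison.

```python
def bigger(a, b):
    if a > b:
        return a
    else:
        return b


def biggest(a, b, c):
    # Hypothetical helper: reuse bigger instead of writing more comparisons.
    return bigger(bigger(a, b), c)


print(bigger(2, 7))      # 7
print(bigger(3, 3))      # 3 (the else branch runs when a is not greater than b)
print(biggest(6, 2, 9))  # 9
```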
While Loops
A while loop makes code that performs the same task many times. It keeps going as long as the test expression is True, and it stops when the test expression becomes False. A loop that never ends is called an “infinite” loop.
For example:
def print_numbers(n):
    i=1
    while i<=n:
        print i
        i=i+1

print_numbers(3)
>>>1
>>>2
>>>3
Breaktime!
The break statement gives us a way to stop the loop even while the test expression is True.
For example:
def print_numbers(n):
    i=1
    while True:
        if i > n:
            break
        print i
        i = i + 1
Debugging
Debugging is a way to locate and correct errors.
- Examine error messages when programs crash
Python tracebacks tell you what line the program crashed on, what file it was running and how it got there. The last line of the traceback tells you what went wrong.
- Work from example code
If your modified code still does not work, comment it out and make step-by-step modifications to the example code until it does what you want.
- Make sure examples work
Check the example code that you are using in order to make sure that it works as you want it to.
- Print intermediate results
When your code does not crash, and yet doesn't work as you want it to, add print statements to your program to see where in the code things stop working correctly.
- Keep and compare old versions
When you have a working version of your code, save it before you add anything to the code. This will give you something to go back to.
Nested Lists
List elements can be anything (characters, strings, numbers, other lists, etc.), so a list can hold anything we want. A list that contains other lists is called a nested list. Lists are more powerful than strings.
For example:
beatles = [['John', 1940], ['Paul', 1942], ['George', 1943], ['Ringo', 1940]]
print beatles
Mutation
Unlike strings, lists support mutation. Mutation means changing the value of an object so we can change the value of a list after we created it. All the other things that we have seen so far (numbers, strings and tuples) are immutable so their contents cannot be changed.
Aliasing
Aliasing can be thought of as two names that refer to the same object. Aliasing is very useful but at the same time it is also very confusing. Any change made to the object through one name is visible through every other name that refers to that object.
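A minimal sketch of mutation and aliasing together (Python 3 syntax; the variable names p, q and r are arbitrary):

```python
p = ['H', 'e', 'l', 'l', 'o']
q = p              # q is an alias: a second name for the same list
q[0] = 'Y'         # mutate the list through the name q

print(p)           # ['Y', 'e', 'l', 'l', 'o']
print(p is q)      # True

r = p + []         # concatenation builds a new list, so r is a separate copy
r[0] = 'J'
print(p[0], r[0])  # Y J
```

Because q and p name the same list, the change made through q shows up when printing p. The copy r is a different object, so changing it leaves p alone.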
List Operations
There are many built-in operations on lists and some of them include:
- Append - The append method adds a new element to the end of a list. This method mutates the list that it is invoked on; it does not create a new list.
For example:
stooges = ['Moe', 'Larry', 'Curly']
stooges.append('Shemp')
print stooges
>>>['Moe', 'Larry', 'Curly', 'Shemp']
- Concatenation - The + operator can be used with lists and it is similar to how it is used to concatenate strings. It produces a new list and it does not mutate either of the input lists.
For example:
[0,1] + [2,3]
>>> [0,1,2,3]
- Length - The len function finds the length of an object. It works for many things other than lists: it works for any object that is a collection of things, including strings. The output from len is the number of elements in its input.
For example:
len([0,1])
>>> 2
When you use len on a string, the output is the number of characters in the string.
For example:
len("Udacity")
>>> 7
Loops on Lists
The for loop is the second type of loop that is frequently seen in Python. The loop goes through each element of the list in turn, assigning that element to the <name> and evaluating the <block>.
for <name> in <list>:
    <block>
For example:
def print_all_elements(p):
    for e in p:
        print e
Index
The index method gives another way to define find_element: it is a built-in list operation that makes find_element easier to write.
The index method is invoked on a list by passing in a value, and the output is the first position where that value sits in the list. If the value is not in the list, it produces an error.
<list>.index(<value>) -> <position> or error
The related in operator tests whether a value is in the list at all. For example:
p = [0,1,2]
print 3 in p
>>>False
print 1 in p
>>>True
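One possible way to write find_element with in and index is sketched below in Python 3. Returning -1 for a missing value is an assumption made for this example; the notes do not say what find_element should return in that case.

```python
def find_element(p, t):
    # Return the first position of t in p, or -1 (assumed convention) if absent.
    if t in p:
        return p.index(t)
    return -1


p = [0, 1, 2, 1]
print(p.index(1))          # 1 (the first position where the value 1 sits)
print(find_element(p, 3))  # -1

try:
    p.index(3)
except ValueError as error:
    # Calling index directly on a missing value raises a ValueError.
    print("index raised an error:", error)
```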
Object Oriented Programming
Object Oriented Programming (OOP) not only structures programs in a better way, but also helps structure the programming tasks. As programs become bigger and more complicated, the problem of managing them also increases. The Object-Oriented approach offers ways of dealing with this complexity, not just in design, but also in the organization of the work. It helps with things such as collaboration, design, separating the interface from the implementation, keeping the interface simple, dividing the work into modules, reusing tested code, inheriting generic code, and making decisions dynamically.
Modules allow you to organize Python code. Grouping related code into a module makes the code easier to understand and to use. In Python, a module is an object with arbitrarily named attributes that you can bind and reference. Simply put, a module is a file consisting of Python code that can define functions, classes and variables.
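For instance, the Python Standard Library modules math and random can be used like this (a small Python 3 sketch):

```python
import math                # bind the whole module to the name math
from random import choice  # bind a single attribute of a module directly

print(math.sqrt(16))       # 4.0
print(math.pi > 3)         # True
print(choice(['a']))       # a (the only element that can be chosen)
```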
There are many reasons as to why Object-Oriented Programming is important:
- Object-Oriented Programming (OOP) is a programming paradigm that uses abstraction to create models based on the real world.
- Object-Oriented Programming uses several techniques from previously established paradigms, which allows tested code to be reused over and over.
- Object-Oriented Programming envisions software as a collection of cooperating objects rather than a collection of functions or simply a list of commands (as is the traditional view).
- In Object-Oriented Programming, each object can receive messages, process data, and send messages to other objects. Each object can be viewed as an independent mini machine with a distinct role or responsibility.
- Object-Oriented Programming promotes greater flexibility and maintainability in programming, and is widely popular in large-scale software engineering.
- Object-Oriented Programming strongly emphasizes modularity, so object-oriented code is simpler to develop and easier to understand later on. Object-oriented code promotes more direct analysis, coding, and understanding of complex situations and procedures than less modular programming methods.
OOP Vocabulary
- Class
A class is like a blueprint, template or set of instructions to build a specific type of object. Every object is built from a class and each class should be designed and programmed to accomplish one thing. Since each class is designed to have only one responsibility, many classes are used to build an entire application.
- Instance
An instance is a single and unique unit of a class. An instance is a specific object built from a specific class. It is assigned to a reference variable that is used to access all of the instance's properties and methods.
- Instance Variables
Instance Variables are all variables associated with the specific instances. They are unique to an object and can be accessed using the keyword “self” inside the class and the instance name outside the class.
- Instance Methods
Instance Methods include all functions that are inside the class that are associated with an instance and have the first argument as “self”.
- Object
An object is a component of a program that knows how to perform certain actions and how to interact with other elements of the program. Objects are the basic units of OOP.
- Method
A method defines the behavior of the objects that are created from the class (a method is an action that an object is able to perform).
- Library
Python already comes with a library and it is called the Python Standard Library which is a collection of modules that covers all of your basic needs. However, for more complicated tasks people often need to create their own tools and modules. That collection of tools or modules ends up being a library.
- Class Variable
A class variable is a variable that is defined in a class of which a single copy exists, regardless of how many instances of this class exist.
- Function
A function is a set of instructions that are combined to achieve some result. A function is independent and not associated with a class. You can use functions anywhere in the code and you do not need to have an object to use them.
- Inheritance
Inheritance is specific to OOP where a new class is created from an existing class. The class being inherited from is called the parent class or superclass. The class that is inheriting is called the child class or subclass. Inheritance comes from the fact that the subclass (the newly created class) contains the attributes and methods of the parent class.
The main advantage of inheritance is the ability to define new attributes and new methods for the subclass, which are then added to the inherited attributes and methods. This can be used to create a highly specialized hierarchical class structure. The biggest advantage is that there is no need to start from scratch when wanting to specialize an existing class. As a result, class libraries can be purchased, providing a base that can then be specialized at will.
- Overriding
If a method inside a subclass is the same as the one within the parent class, the new method will override the parent one. The override method in the child class should have the same name, signature, and parameters as the one in its parent class.
- Abstraction
Abstraction allows us to focus on the program we actually want to write and to use modules within the Python Standard Library without knowing the details of the programming in those modules. This can help when encapsulating the functionality of an object, because it helps identify the important information that should be made visible and the unimportant information that can be hidden. Abstraction also helps with the “do not repeat yourself” principle. By taking what a group of objects have in common and abstracting it, we can prevent redundant code in each object, which in turn creates more maintainable code.
Abstraction means working with something we know how to use without knowing how it works internally. Abstraction is something we do every day. It is an action that obscures all details of a certain object that do not concern us and uses only the details that are relevant to the problem we are solving.
Abstraction is one of the most important concepts in programming and OOP. It allows us to write code, which works with abstract data structures (like dictionaries, lists, arrays and others). We can work with an abstract data type by using its interface without concerning ourselves with its implementation.
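Much of the vocabulary above can be seen in one small Python 3 sketch. The classes Animal and Dog are invented for this illustration and do not come from the course material.

```python
class Animal:
    count = 0  # class variable: a single copy shared by every instance

    def __init__(self, name):
        self.name = name   # instance variable: unique to each object
        Animal.count += 1

    def speak(self):       # instance method: its first argument is self
        return self.name + " makes a sound"


class Dog(Animal):         # Dog is the child class, Animal is the parent class
    def speak(self):       # same name as the parent's method, so it overrides it
        return self.name + " barks"


generic = Animal("Generic")  # an instance built from the Animal class
rex = Dog("Rex")             # Dog inherits __init__ from Animal

print(generic.speak())  # Generic makes a sound
print(rex.speak())      # Rex barks
print(Animal.count)     # 2
```

Code that calls speak() does not need to know which class an object was built from; that is a small instance of working with an interface without looking at the implementation.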
Network
A Network is a group of entities (computers, people, organizations) that can communicate, even though they are not all directly connected. There have to be at least 3 entities involved to call it a network.
Latency and Bandwidth
Two main ways to measure a network are called latency and bandwidth.
Latency is the time that it takes for a message to get from the source to the destination. It is measured in seconds, and for a fast network today it is more often measured in milliseconds.
Bandwidth is a measure of the amount of information that we can send per unit of time. In order to measure bandwidth, we need to know how many bits we can send per second.
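As a rough worked example, the two measures can be combined to estimate how long a transfer takes. This uses a simplified model that the notes do not state (wait one latency, then send at full bandwidth), and all of the numbers are hypothetical.

```python
def transfer_time(size_bits, latency_s, bandwidth_bps):
    # Simplified model: wait one latency, then send every bit at full bandwidth.
    return latency_s + size_bits / bandwidth_bps


# Hypothetical numbers: an 8,000,000-bit file, 50 milliseconds of latency,
# and a bandwidth of 10,000,000 bits per second.
seconds = transfer_time(8000000, 0.05, 10000000)
print(round(seconds, 2))  # 0.85
```

For a large file the bandwidth term dominates; for a tiny message almost all of the time is latency.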
Bit
A bit is the smallest unit of information: a single value that is either 0 or 1.
Protocol
Protocol is a set of rules that people agree to and that tells you how two entities can talk to each other. For the web, the protocol gives rules about how a client and a server talk to each other. The protocol says that if the client (web browser) wants to get the server (web server) to do something, it has to send a message in a particular way.
Hypertext Transfer Protocol (HTTP) is the protocol that we use on the web, and when you look in your browser, you will see that almost all URLs start with http. This means that the protocol used to talk to the server that you are requesting a document from is HTTP.
A GET message is a message that the client sends to the server. The message says GET, followed by the name of the object that you want to get. As a result, the server will find the requested file, possibly run some code to produce the result, and send back a response containing the contents of the requested object.
URL
- URL stands for Uniform Resource Locator and it is a unique address for a file that is accessible on the Internet.
- Protocol is an indicator of what overarching framework is being used to transmit data back and forth. Far and away, the most commonly used is HTTP. FTP (File Transfer Protocol) is another fairly common protocol, which is used more for “file”-related data. The protocol is followed by the :// symbols.
- Host is the domain name of the server that has the document that we want to access. This can also be an IP address, which identifies the machine that has the document that we want to fetch.
- Path identifies the specific resource in the host that the web client wants to access.
Query Parameters
Query parameters are also known as GET parameters. If a query string is used, it follows the path component, and provides a string of information that the resource can use for some purpose (for example, as parameters for search or as data to be processed). The query string is usually a string of name and value pairs.
The format of the query parameter looks like this:
http://example.com/foo?p=1&q=neat
- p (name) = 1 (value - it can be anything)
- the first query parameter is separated from the URL using ? mark
- all of the other parameters are separated from each other using &
A cache is something that stores data so that you don’t have to retrieve it again later. It can be used to make data requests faster.
Another piece of the URL is called a fragment. A fragment is separated from the rest of the URL by a # sign. A fragment is generally used to reference a particular part of the page you are looking at. When a fragment is used in the URL, it is not sent to the server when you make a request. The fragment exists purely in the browser. When there are query parameters in the URL, the fragment follows the query parameters and it comes last.
In order to make an internet connection, you need two things:
- the address of the machine (which is represented by the host)
- a port (for HTTP the default port is 80; if you want to use a different port, you can include it in the URL between the host and the path, separated from the host by a colon)
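The pieces of a URL can be pulled apart with Python's standard urllib.parse module, as in this Python 3 sketch. The URL extends the example above; the port 8080 and the fragment "section" are added purely for illustration.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL: the port and the fragment are added for illustration.
url = "http://example.com:8080/foo?p=1&q=neat#section"
parts = urlsplit(url)

print(parts.scheme)    # http (the protocol)
print(parts.hostname)  # example.com (the host)
print(parts.port)      # 8080
print(parts.path)      # /foo
print(parts.query)     # p=1&q=neat
print(parts.fragment)  # section

# Split the query string into its name and value pairs.
params = parse_qs(parts.query)
print(params)          # {'p': ['1'], 'q': ['neat']}
```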
HTTP
HTTP is what the browser uses to talk to Web servers. The request from the browser for the URL www.example.com/foo begins with a request line, which looks something like this:
GET /foo HTTP/1.1
- Method is the type of request we are making to the server. The most common method is GET. Another popular method is POST.
- Path is the actual document that we are requesting from the server.
- Version is always HTTP/ followed by a version number. Most browsers and servers these days speak 1.1.
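A request line such as GET /foo HTTP/1.1 is three fields separated by single spaces. The sketch below (Python 3; the function name parse_request_line is made up) splits one into the three parts just described. A real server would also have to handle malformed input.

```python
def parse_request_line(line):
    # Split "METHOD PATH VERSION" into its three parts.
    method, path, version = line.split(" ")
    return {"method": method, "path": path, "version": version}


request = parse_request_line("GET /foo HTTP/1.1")
print(request)
```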
HTTP Responses
A basic HTTP response looks very similar to the request. For a basic request GET /foo HTTP/1.1, the response will begin with the status line HTTP/1.1 200 OK. Some of the common status codes include:
- 200 OK - it means that the document is found
- 302 Found - the document is located somewhere else
- 404 Not Found - the document wasn’t found
- 500 Server Error - the server broke trying to handle your request
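Python's standard library knows these codes too. The sketch below (Python 3) looks up the standard phrase for each code in the list; note that the official phrase for 500 is "Internal Server Error". The helper is_error is a made-up convenience for this example.

```python
from http import HTTPStatus

# Print the standard reason phrase for each status code listed above.
for code in (200, 302, 404, 500):
    print(code, HTTPStatus(code).phrase)


def is_error(code):
    # Codes from 400 upward signal a client (4xx) or server (5xx) error.
    return code >= 400
```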
There are two main classifications for the type of responses that a server will make:
- Static - a pre-written file that the server just returns; ex: an image
- Dynamic - the response is built on the fly by a program that is running. Just about all the content online today is dynamic. Almost every website today, including Udacity, Reddit, Hipmunk, Wikipedia and Facebook, builds its pages dynamically. These pages are built by programs called web applications. A web application is just a program that generates content. It lives on a web server, it speaks HTTP, and it generates content that your browser requests.
Forms
Forms are all over the place on the internet. They are the text boxes, check boxes and radio buttons that allow websites to get information from users. On Udacity.com, every quiz that you submit is a form.
Within the form element, an input element adds an input box to your site. There are several options that can be used with <input>:
- adding <input name = "q"> gives you an input box; pressing enter adds the content of the box to the URL as a query parameter named q
- adding <input type = "submit"> gives you a submit button; pressing the submit button adds the content of the box to the URL
- the input type password shows dots in the input box while you type, but the actual characters typed still appear in the URL (this is a security concern)
- the input type checkbox gives you a check box; when the box is checked and the submit button is pressed, it adds the parameter with the value on to the URL; pressing the submit button without checking the box makes the parameter not appear at all
- the input type radio gives you radio buttons; to behave like conventional radio buttons (only one selected at a time), they need to have the same name. A value attribute in the <input> differentiates the radio buttons in the URL. Adding a label element to the form adds visible text to each radio button on the web page.
- the select element gives you a dropdown box; adding the value attribute to an option inside the select lets the selection box show the user one text while sending a different value to the server or program.
The Modulus Operator
The Modulus Operator is written with a % sign. It takes a number and maps it into a range, based on the remainder when you divide that number by the modulus.
The syntax is as follows:
<number> % <modulus> -> remainder
In the modulus operation, the first number is divided by the second number (the modulus), and the result is the remainder of that division.
For example:
15 % 12 is 3
If the second number is larger than the first (and both are positive), the answer is the first number, because the second number goes into the first zero times and the whole first number is left over as the remainder.
For example:
3 % 4 is 3
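The same examples in runnable form (Python 3), together with a typical use of the operator: wrapping a count around a fixed range. The helper clock_hour is hypothetical.

```python
print(15 % 12)  # 3
print(3 % 4)    # 3
print(12 % 12)  # 0 (12 divides evenly, so nothing is left over)


def clock_hour(hours_from_midnight):
    # Hypothetical helper: wrap any hour count into the range 0 to 11.
    return hours_from_midnight % 12


print(clock_hour(15))  # 3
print(clock_hour(27))  # 3
```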
Dictionaries
Dictionaries are sets of key : value pairs in {} brackets. They are mutable.
d[k] gives the value associated with the key k in d. To replace the value associated with k with v, write d[k] = v.
The syntax looks like this:
elements = { 'hydrogen': 1, 'helium': 2, 'carbon': 6 }
- Unlike a list, where the elements are ordered, with a dictionary there is no order to elements
- When you try to look up a value and you get a KeyError, it means that this key is not in the dictionary
- you can use an assignment to add an element to a dictionary:
elements['nitrogen'] = 7
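A short Python 3 sketch of these dictionary operations, using the same elements dictionary. The deliberately wrong value 8 is only there to show that assignment replaces an existing value.

```python
elements = {'hydrogen': 1, 'helium': 2, 'carbon': 6}

elements['nitrogen'] = 8      # add a new key : value pair (wrong on purpose)
elements['nitrogen'] = 7      # assigning to an existing key replaces its value
print(elements['nitrogen'])   # 7

print('lithium' in elements)  # False

try:
    elements['lithium']
except KeyError:
    # Looking up a key that is not in the dictionary raises KeyError.
    print("lithium is not a key in the dictionary")
```

Checking with in before the lookup is one way to avoid the KeyError.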
Google App Engine lets you build and run applications on Google’s infrastructure. App Engine applications are easy to create, easy to maintain and easy to scale as your traffic and data storage needs change. With App Engine, there are no servers for you to maintain. You simply upload your application and it is ready to go.
Differences Between GET and POST
GET:
- Parameters are included in the URL
- These requests are often used for fetching documents
- Parameters are affected by the maximum URL length
- Parameters are ok to cache
- Should not change the server
POST:
- Parameters are in the body
- Often used for updating data
- Essentially, no max length
- Parameters are not ok to cache
- Ok to change the server
GET requests should be simple requests for fetching a document. GET parameters should be used to describe what document you are looking for. POST parameters are used for making updates to the server.
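The difference in where the parameters travel can be sketched with Python's standard urllib (Python 3). The requests below are only constructed, never sent, and the URL is the example address used earlier.

```python
from urllib.parse import urlencode
from urllib.request import Request

params = urlencode({"p": 1, "q": "neat"})  # 'p=1&q=neat'

# GET: the parameters travel in the URL itself.
get_request = Request("http://example.com/foo?" + params)

# POST: the parameters travel in the request body (as bytes).
post_request = Request("http://example.com/foo", data=params.encode("ascii"))

print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.data)
```

Request chooses the method from the presence of a body: with data it is a POST, without data it is a GET.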
Validation
Validation means verifying on the server side that what we received is what we expected.
Web applications are vulnerable if you don't practice input validation. Validating user input could prevent application attacks such as buffer overflow, SQL injection and cross-site scripting. Proper validation of form data is important to protect your form from hackers and spammers!
Purposes of Data Validation:
- If a user submits data that is not within the allowed values or it is in the wrong format, it may cause the application to exhibit unexpected behaviour – which may include a blank screen or a screen that doesn’t make sense. Validation allows for this to be prevented, and instead to present a human-readable error message back to the user. Allowing the user to see why the input wasn’t accepted greatly aids in usability of the application.
- A malicious user of the application may attempt to exploit problems in the application by sending data that is not in the format that the application expects or outside the range of values that a user should be using. The value used, if not checked, may grant the user access to some aspect of the application otherwise hidden, due to an internal problem in the application.
Data validation can help to ensure that the data stored is complete and that nothing is missing. For instance, ensuring that ‘required’ fields are indeed filled out by the user ensures that there won’t be gaps (or empty strings) in a database record, which may cause problems when the incomplete data is acted upon later, for instance to follow up with a customer.
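A minimal sketch of server-side validation for a form that asks for a month and a day (Python 3). The function names valid_month, valid_day and check_form and the error messages are assumptions made for this example, not part of any framework.

```python
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def valid_month(text):
    # Return the normalized month name, or None if the input is not a month.
    candidate = text.strip().capitalize()
    if candidate in MONTHS:
        return candidate
    return None


def valid_day(text):
    # Return the day as an integer from 1 to 31, or None otherwise.
    try:
        day = int(text)
    except ValueError:
        return None
    if 1 <= day <= 31:
        return day
    return None


def check_form(month_text, day_text):
    # Produce a human-readable message instead of failing later on bad data.
    if valid_month(month_text) is None:
        return "Please enter a valid month."
    if valid_day(day_text) is None:
        return "Please enter a day between 1 and 31."
    return "OK"


print(check_form("may", "12"))    # OK
print(check_form("Smarch", "3"))  # Please enter a valid month.
print(check_form("May", "40"))    # Please enter a day between 1 and 31.
```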
Python is a very powerful programming language. It can use functions to take in user inputs, draw web pages with strings of HTML that are rendered right from the Python file, and store user data in the URL or via the use of hidden inputs. While all these varied functions seem like an ideal way to create web pages, it has limitations.
HTML embedded in code is messy and difficult to maintain. It's better to use a templating system, where the HTML is kept in a separate file with special syntax to indicate where the data from the application appears. Google App Engine includes the Django and Jinja2 templating engines. Using HTML templates allows programmers to avoid repetition, save time and make fewer mistakes.
Template Inheritance
Most web pages have a title, a footer, and some content in the middle. You might have another page that has the same title and footer but different content. To avoid repeating the code, you can create a base template that can be referenced by other templates to add common elements (header, footer). Instead of placing a header at the top of 6 HTML pages, you can create one file with the header and extend it to all the other files. All changes that need to be made to the header can be made in one file and they will affect the other 5 HTML pages. Conversely, if the content needs to change daily, this can be done without affecting the header file.
Benefits of Using Templates:
- Separate different types of code - you can keep your HTML separate from your Python, which makes things easier and keeps your code cleaner
- Make more readable code - better organized code is more readable
- More secure websites - if you use the auto-escaping feature of Jinja or other template engines, text entered by users is escaped before it is placed in the page, which greatly reduces the risk of users injecting insecure or malicious HTML into your website
- HTML file that is easier to modify - instead of trying to modify strings in a Python editor, you have HTML in your HTML editor that is much easier to handle
- Avoid code repetition - templates can help avoid re-writing HTML code many times, for example, by automating the insertion of elements like divs or by using template inheritance. This makes the code more efficient, saves time and effort, and reduces errors.
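The security benefit in the list above comes from escaping. The sketch below does by hand what an auto-escaping template engine does automatically, using html.escape from the standard library rather than Jinja itself. The comment template and the render_comment function are assumptions for illustration.

```python
import html
from string import Template

COMMENT_HTML = Template('<p class="comment">$text</p>')


def render_comment(user_text):
    """Escape <, >, & and quotes so user input is displayed as text, not run as HTML."""
    return COMMENT_HTML.substitute(text=html.escape(user_text))


rendered = render_comment("<script>alert('hi')</script>")
```

The script tag in the input ends up in the page as harmless text (starting with &lt;script&gt;) instead of as an element the browser would execute.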
See the Jinja documentation to learn more about Jinja.
A database program is a type of computer software designed to handle large amounts of data and to store it in such a way that finding (and thus retrieving) any piece of data is more efficient than it would be if the data were simply dumped in no particular order. With such database software, if you keep (say) a list of customers and their shipping addresses, entering and retrieving information about your one millionth customer will not take much longer (if at all) than entering and retrieving information about your first customer.
Types of Databases
- Relational databases - these often use a language called SQL to manipulate the data (PostgreSQL, MySQL, SQLite, Oracle). These databases all work with tables
- Non-relational databases - Google App Engine's Datastore, Dynamo (used by Amazon), and NoSQL databases such as MongoDB and CouchDB
None of these databases is considered the best. They are all different, and each has its place in solving different types of problems. It is not uncommon for large websites to use several different database products.
SQL
SQL stands for Structured Query Language and it is a language for expressing queries. SQL was invented in the 1970s, long before the internet, the web or web applications. SQL looks something like this:
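As one illustration, the sketch below runs a small query through Python's built-in sqlite3 module. The links table, its columns, and the sample rows are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE links (id INTEGER PRIMARY KEY, title TEXT, votes INTEGER)")
conn.executemany(
    "INSERT INTO links (title, votes) VALUES (?, ?)",
    [("First post", 12), ("Second post", 40), ("Third post", 3)],
)

# Select the links with more than 10 votes, highest vote count first.
rows = conn.execute(
    "SELECT title, votes FROM links WHERE votes > ? ORDER BY votes DESC",
    (10,),
).fetchall()
```

The query itself is the SELECT ... FROM ... WHERE ... ORDER BY string; the rest is Python code that sends it to the database and collects the matching rows.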
Joins
Joins are a type of SQL query that involves multiple tables. A join query will look like this:
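For instance, assuming two hypothetical tables, users and links, where each link records the id of the user who submitted it, a join can be tried out with Python's built-in sqlite3 module. All table names, column names, and rows here are made up for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE links (id INTEGER PRIMARY KEY, title TEXT, submitter_id INTEGER);
    INSERT INTO users (id, name) VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO links (title, submitter_id) VALUES
        ('Intro to SQL', 1), ('Why indexes matter', 2), ('Joins explained', 1);
""")

# Pair every link with the name of the user who submitted it.
rows = conn.execute("""
    SELECT links.title, users.name
    FROM links
    JOIN users ON links.submitter_id = users.id
    ORDER BY links.id
""").fetchall()
```

Each result row combines a column from links with a column from users, matched on the submitter's id.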
Indexes
An index is just like an index in a book: it lets the database find rows without scanning the whole table, which makes queries and other database reads faster. Indexes come with a maintenance cost, because they have to be kept up to date whenever the rest of the database changes. If a table has multiple indexes, each update to the table also has to update each of those indexes. In short, indexes increase the speed of database reads but decrease the speed of database inserts.
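One way to see an index at work is to ask SQLite how it plans to run a query before and after the index exists. The sketch below uses Python's built-in sqlite3 module; the links table, the index name idx_links_submitter, and the query_plan helper are assumptions for illustration, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE links (id INTEGER PRIMARY KEY, title TEXT, submitter_id INTEGER)")
conn.executemany(
    "INSERT INTO links (title, submitter_id) VALUES (?, ?)",
    [(f"link {i}", i % 100) for i in range(1000)],
)


def query_plan(sql):
    """Return SQLite's description of how it would execute the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(row[-1] for row in rows)  # the last column holds the detail text


QUERY = "SELECT title FROM links WHERE submitter_id = 7"
plan_before = query_plan(QUERY)  # no index yet, so the whole table is scanned

conn.execute("CREATE INDEX idx_links_submitter ON links (submitter_id)")
plan_after = query_plan(QUERY)   # the plan now mentions the new index
```

From this point on, every insert into links also has to update idx_links_submitter, which is the write cost described above.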
Indexes for Sorting
ACID stands for:
- Atomicity - all parts of a transaction succeed or fail together. A transaction is just a group of commands. Maybe we're updating two tables at once, or updating multiple rows in our database together. Atomicity means that all those commands are going to succeed or they're all going to fail. We don't do just a part of the transaction.
- Consistency - the database will always be consistent. On Reddit, a link has a score, and the user submitting the link has a karma score that gets updated at the same time. Consistency means we don't update the score without also updating the karma.
- Isolation - no transaction can interfere with another's. Let's say a link on Reddit gets an upvote and a downvote at the same time. Isolation means the computation of the upvote will not interfere with the computation of the downvote. Each transaction cannot affect other transactions. Sometimes this is accomplished through locking, but the database needs some way to resolve these types of conflicts.
- Durability - once a transaction is committed, it won't be lost. This means that if we update some rows in our database and the database reports success, we won't lose that transaction even if the database is turned off, crashes or is unplugged.
It is difficult to build a database that is fully atomic, consistent, isolated and durable. There are always trade-offs involved.
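Atomicity and consistency can be seen in a small sketch built on the Reddit example above, using Python's built-in sqlite3 module. The table layout, the vote function, and the rule that karma may not go below zero are assumptions invented so that the second update has a reason to fail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE links (id INTEGER PRIMARY KEY, score INTEGER);
    CREATE TABLE users (id INTEGER PRIMARY KEY, karma INTEGER CHECK (karma >= 0));
    INSERT INTO links (id, score) VALUES (1, 5);
    INSERT INTO users (id, karma) VALUES (1, 0);
""")


def vote(link_id, user_id, delta):
    """Update a link's score and its submitter's karma as one transaction."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute("UPDATE links SET score = score + ? WHERE id = ?", (delta, link_id))
            conn.execute("UPDATE users SET karma = karma + ? WHERE id = ?", (delta, user_id))
        return True
    except sqlite3.IntegrityError:
        return False


ok_up = vote(1, 1, +1)    # both updates succeed and are committed together
ok_down = vote(1, 1, -5)  # karma would go negative, so both updates are undone

score = conn.execute("SELECT score FROM links WHERE id = 1").fetchone()[0]
karma = conn.execute("SELECT karma FROM users WHERE id = 1").fetchone()[0]
```

After the failed vote, the score update that had already run inside the transaction is rolled back as well, so the score and the karma never get out of step.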
Google App Engine
Datastore is the database provided by App Engine. What we have been referring to as tables are known as entities in Datastore. They serve basically the same purpose: they are how you organize things of the same type together.
Some important things about entities:
- columns are not fixed. In a table you have a fixed number of columns that you define when you create a table. In an entity, you can have whatever columns you want, even entities of the same type don't have to have the same columns. This makes development much easier. When you're working with tables, you often realize you need to add a column, which can be very troublesome. With Datastore, you can change the columns as you are developing
- entities all have an ID. The ID can be assigned automatically, or you can choose your own: an integer, a string, whatever you want
- entities have parents and ancestors. This is a relationship to other entities with a couple of specific use cases.
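The three properties above can be pictured with a toy model in plain Python. This is not the App Engine Datastore API; the Entity class, its attributes, and the sample data are hypothetical and exist only to illustrate flexible properties, IDs, and parents.

```python
import itertools


class Entity:
    """Toy stand-in for a Datastore entity; NOT the App Engine API."""

    _auto_ids = itertools.count(1)

    def __init__(self, kind, entity_id=None, parent=None, **properties):
        self.kind = kind
        # Use the caller's ID (integer or string) or assign one automatically.
        self.id = entity_id if entity_id is not None else next(Entity._auto_ids)
        self.parent = parent
        self.properties = properties  # no fixed set of columns

    def ancestors(self):
        """Return the chain of parents, nearest first."""
        chain, node = [], self.parent
        while node is not None:
            chain.append(node)
            node = node.parent
        return chain


blog = Entity("Blog", entity_id="main")
post = Entity("Post", parent=blog, title="Hello", body="First post")
comment = Entity("Comment", parent=post, text="Nice!")
# Same kind as post, but with a different set of properties.
draft = Entity("Post", parent=blog, title="Draft", tags=["wip"])
```

Here post and draft are the same kind of entity but carry different properties, blog has a hand-picked string ID while the others get automatic integer IDs, and comment has post and blog as its ancestors.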