DEVENDER PALSA
7 min readJan 29, 2020

My Learning_SQL1

SQL (Structured Query Language) is a common tool for retrieving data from relational databases such as MySQL, SQL Server, MariaDB, and PostgreSQL.
Together with R and Python,
SQL is one of the top skill in mastering Data Science.

SQL?
SQL stands for Structured Query Language, it’s a language for manipulating and talking about data in databases but was originally called SEQUEL for Structured English Query Language.

SQL is the standard language for Relational Database System and all the Relational Database Management Systems (RDMS) like MS Access, , Microsoft Transact-SQL or T-SQLMySQL, SQLite, Oracle, Postgres, SQL Server, Sybase and Informix use SQL as their standard database language.

SQL is followed by a unique set of rules and guidelines called Syntax.

We need to understand what are the basic parts of a statement to start building SQL statements.
In overall, something we write in SQL to get an answer from a database, or to make a change to it is called a statement.

All the SQL statements start with any of the keywords like SELECT, INSERT, UPDATE, DELETE, ALTER, DROP, CREATE, USE, SHOW and all the statements end with a semicolon (;).

A statement is made up of Clauses and Clauses are the basic components of statements that make up the whole and these clauses are constructed out of keywords, which tell to the the database to take some action.
Predicates which specify what information we’re working with and Predicates include a value, or condition called an expression.

Note: SQL is case insensitive which means SELECT and select have same meaning in SQL statements.

Database
A database is a collection of information. If we think about a list of people, city they live in and their favorite color. In this list we have three kinds of information, a name of person, name of a city, and a color they like.

In any database, information is organized into columns, and rows. The columns are known as fields and the rows are called records.
If we think about a record like a card with an each person’s information on it. All the cards have same spaces for information, all have the same field but each card related to one person and each one is an individual record and together, records and fields make up a table.

Database Schema: We can use a spreadsheet like Excel to store our data but databases allow us to add more tables and also we can set up rules and relationships between the tables.
A database schema is the skeleton structure that represents the logical view of the entire database. It defines how the data is organized and how the relations among them are associated(The definition and layout of how fields, tables and relationships are set up known as the schema of the database).

Ask for data from a Database with SELECT statement

The most basic way is to use a select statement.
The SQL SELECT statement is used to fetch the data from a database table which returns this data in the form of a result table. These result tables are called result-sets.
The select keyword inform the database that we want some information returned to us. For example:

Select ‘Hello, World!’;

Narrow down a query with WHERE
When we are asking the database for information, it is very helpful to be able to give some parameters and we can use the WHERE keyword to add a condition to a SELECT statement.

SELECT ID, NAME, SALARY FROM CUSTOMERS WHERE SALARY > 2000;

We can add more criteria to where statement.

The SQL AND & OR operators(the conjunctive operators) are used to combine multiple conditions to narrow data in an SQL statement.

Limit responses using Like, Limit and OFFSET

SELECT first_name, last_name, state FROM People WHERE first_name LIKE ‘%D’; — which ends with J
SELECT first_name, last_name, state FROM People WHERE first_name LIKE ‘D%’ LIMIT 5;which starts with J

We can tell the database that how many responses we would like to see using the LIMIT keyword. LIMIT tells the database to stop returning results after a given number of them have met our search criteria.

SELECT first_name, last_name, state FROM People WHERE first_name LIKE ‘D%’ LIMIT 10;

If wanted to see a specific range of them, like the second set of five responses, these here, Antonio West through Rachel Robinson, I can use the OFFSET command to tell the database to skip some records before counting off my five.

SELECT first_name, last_name, state FROM People WHERE first_name LIKE ‘D%’ LIMIT 10 OFFSET 5;

Organize the responses with ORDER BY

We can use the “order by” keyword to sort data. The SQL ORDER BY clause is used to sort the data in ascending or descending order based on one or more columns.

SELECT first_name, last_name, state FROM People ORDER BY first_name DESC;

Find information about the data with LENGTH, DISTINCT and COUNT

SELECT LENGTH(first_name) FROM people;
SELECT DISTINCT(first_name) FROM people;
SELECT COUNT(*) FROM people WHERE state=’CA’;

Understand JOIN and JOIN types

The SQL Joins clause is used to combine records from two or more tables in a database. A JOIN is a means for combining fields from two tables by using values common to each.

In one table I have information about some people and in another states table, I have some information about states in the U.S. I have their name, their abbreviation and what census region and division they are part of.
We can join these both tables:

SELECT first_name, state FROM people
JOIN states
ON people.state=states.state_abbrev;

SELECT * FROM people
LEFT JOIN states
ON people.state=states.state_abbrev;

SELECT * FROM people
RIGHT JOIN states
ON people.state=states.state_abbrev;

Combine results with GROUP BY:
When we need to divide results in various ways, we can use the SQL GROUP BY clause in collaboration with the SELECT statement to arrange identical data into groups and this GROUP BY clause follows the WHERE clause in a SELECT statement and precedes the ORDER BY clause.

SELECT first_name, COUNT(first_name) FROM people GROUP BY first_name;

Data types in SQL:
SQL Server offers six categories of data types for your use which are -Numeric, Character Strings, Date and Time and Binary.

SQL has a few categories of data types- Numeric, Character Strings, Date and Time and Binary.
We should be aware of and dealing with binary information with dates and times, with numbers and with text.

The number types represent different kinds of numeric values including integers of different lengths, floating points of various lengths and precisions and so on.

The text types represent different lengths of characters treated as text rather than as numbers or binary data so we usually use VARCHAR for variable character length.
To store the number of points that someone earned on a quiz, we would probably use an integer type, because we aren’t concerned about decimal points there.

We use a date type to store calendar date, to store a true/false value we use a binary/BOOLEAN .
Null is a value represents nothing in a particular field and is a value different than zero/different than no/false and boolean.

Note: null is not zero and null is not no.

Math in SQL
It is a good idea not to rely on SQL math for financial, or scientific applications where high precision is necessary.
-MAX, MIN, SUM, COUNT
SELECT 4+2;
SELECT 1/3;
SELECT 1/3.0;
SELECT 2>3;
SELECT 2❤;
SELECT 5!=3;
SELECT 3=3;

Compound Select
When we need to use more than one SELECT statement to get the information and we can use a secondary SELECT statement called a sub-query/ a sub-select and it helps to narrow down or zero in on a specific set of data to use in a larger query.
Let’s get a list of all the people in our database who achieved the highest score on our quiz.

SELECT first_name, last_name, quiz_points
FROM people
WHERE quiz_points = (SELECT MAX(quiz_points) FROM people);

We could also use a sub-query to use information from one table to inform a search on another table. Let’s say we want to find all of the people form Minnesota but you never remember what the state abbreviation is for Minnesota, is it Mi/Ma/Mn? and there are so many similars. Instead of specifying a state name abbreviation, let’s look up the name Minnesota in the states table, get abbreviation and then use that to look up people.

SELECT * FROM people
WHERE state= (SELECT state_abbrev FROM states WHERE state_name= ‘Minnesota’);

Transform data with LOWER, SUBSTR, REPLACE and CAST:

Data is stored one way in the database doesn’t mean we can’t transform it into something we need with our query. Some common functions we will look at are changing the case of a string, converting a value into a different type, trimming a value and replacing a particular string in a field.

SELECT LOWER(first_name), SUBSTR(last_name, 1,5), REPLACE(last_name, ‘a’, ‘_’), CAST(quiz_points AS CHAR)
FROM people;

Use aliases to shorten field names with AS:

SELECT first_name, UPPER(first_name) AS surname
FROM people;

Add or Modify Data:
To add data to a database table, we can use the Insert keyword using the Insert clause.

INSERT INTO people(first-name) VALUES (‘DEV’);

Let’s take a look at the table

SELECT * FROM people;

Modify data in a table using UPDATE with SET :
Update is useful where we want to change information that is already present, rather than adding new records with different information and optionally use a WHERE clause to specify how to find records to update otherwise the change will apply to the whole table.

Remove data from a table with a delete keyword:
As like an update statement, we need to add a condition to set the scope of deletion, otherwise it will delete everything in the table.

DELETE FROM people
WHERE first_name=’DEVENDER’ and last_name= ‘PALSA’;

Good Programming Guidelines in SQL:

  • Don’t forget to write proper comments
    -Always use table aliases when your SQL statement involves more than one source.
    -If you are getting an error message be sure to read it and break down statement
    -Don’t use column numbers in the ORDER BY clause
    -Put text values in ‘single quotes’
    -Put field names with spaces `inside backticks`
    -Keep a development journal in plain text
    -If you want to find null values use IS NULL or IS NOT NULL instead of ‘0’, ‘no’, ‘false’ (Note that a null value is not the same as zero or no or false)
    -You have to test your matching conditions before using them for a destructive action(UPDATE/DELETE)
  • <script src=”https://platform.linkedin.com/badges/js/profile.js" async defer type=”text/javascript”></script>
  • <div class=”badge-base LI-profile-badge” data-locale=”en_US” data-size=”medium” data-theme=”dark” data-type=”VERTICAL” data-vanity=”devenderpalsa” data-version=”v1"><a class=”badge-base__link LI-simple-link” href=”https://in.linkedin.com/in/devenderpalsa?trk=profile-badge">Devender Palsa</a></div>

DEVENDER PALSA
DEVENDER PALSA

Written by DEVENDER PALSA

SAS Programmer | Data Analytics | Clinical Trials | CDISC

No responses yet