Master-DataScience-Notes/1year/2trimester/Coding for Data Science - Python language/Python/Examples/exercises_in_python.ipynb

2020-02-18 19:18:38 +01:00
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Exercises in Python\n",
"\n",
"In the first three lessons, you have seen:\n",
"\n",
"- lists (including list comprehensions);\n",
"- tuples;\n",
"- flow control (selection and loop statements);\n",
"- modules;\n",
"- dictionaries.\n",
"\n",
"We will review the above topics with five exercises.\n",
"\n",
"## Exercise 1\n",
"\n",
"Write a program that prints the multiplication table of a number for the multipliers from 2 to 20 with a step of 2.\n",
"\n",
"For instance, given ``n = 10``, the program should output:\n",
"\n",
"``10 x 2 = 20\n",
"10 x 4 = 40\n",
"10 x 6 = 60\n",
"10 x 8 = 80\n",
"10 x 10 = 100\n",
"10 x 12 = 120\n",
"10 x 14 = 140\n",
"10 x 16 = 160\n",
"10 x 18 = 180\n",
"10 x 20 = 200``"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can address this task with a ``for`` loop and ``range``, which represents an immutable sequence of numbers and is commonly used to repeat a block of code a specific number of times."
]
},
{
"cell_type": "code",
2020-02-24 18:50:30 +01:00
"execution_count": 19,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"10 x 2 = 20\n",
"10 x 4 = 40\n",
"10 x 6 = 60\n",
"10 x 8 = 80\n",
"10 x 10 = 100\n",
"10 x 12 = 120\n",
"10 x 14 = 140\n",
"10 x 16 = 160\n",
"10 x 18 = 180\n",
"10 x 20 = 200\n"
]
}
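],
"source": [
"n = 10\n",
"# The multipliers 2, 4, ..., 20 are fixed by the exercise, so the loop bounds must not depend on n\n",
"for i in range(1, 11):\n",
"    print(n, \"x\", i * 2, \"=\", n * i * 2)"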
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Recall that ``range`` also accepts a ``step`` argument. We can therefore improve the code by setting ``step = 2``, so that the loop variable is directly the multiplier and no doubling is needed inside the loop.\n",
"\n",
"As a general rule, if you don't need the value of a variable later, just don't create it."
]
},
{
"cell_type": "code",
"execution_count": 86,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"10 x 2 = 20\n",
"10 x 4 = 40\n",
"10 x 6 = 60\n",
"10 x 8 = 80\n",
"10 x 10 = 100\n",
"10 x 12 = 120\n",
"10 x 14 = 140\n",
"10 x 16 = 160\n",
"10 x 18 = 180\n",
"10 x 20 = 200\n"
]
}
],
"source": [
"n = 10\n",
"# range(2, 21, 2) yields 2, 4, ..., 20, so i is already the multiplier\n",
"for i in range(2, 21, 2):\n",
"    print(n, \"x\", i, \"=\", n * i)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Finally, we write the same program in one line using a list comprehension. List comprehensions provide a compact way to build a list by transforming, and optionally filtering, the elements of a sequence. They implement the following ``for`` loop\n",
"\n",
"``result = []\n",
"for <variable> in <sequence>:\n",
"    if <condition>:\n",
"        result.append(<expression>)``\n",
" \n",
"in the following equivalent form\n",
"\n",
"``[<expression> for <variable> in <sequence> if <condition>]``\n",
"\n",
"In our case, we do not need the filtering part, so we obtain:"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"10 x 2 = 20\n",
"10 x 4 = 40\n",
"10 x 6 = 60\n",
"10 x 8 = 80\n",
"10 x 10 = 100\n",
"10 x 12 = 120\n",
"10 x 14 = 140\n",
"10 x 16 = 160\n",
"10 x 18 = 180\n",
"10 x 20 = 200\n"
]
},
{
"data": {
"text/plain": [
"['10x2=20',\n",
" '10x4=40',\n",
" '10x6=60',\n",
" '10x8=80',\n",
" '10x10=100',\n",
" '10x12=120',\n",
" '10x14=140',\n",
" '10x16=160',\n",
" '10x18=180',\n",
" '10x20=200']"
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"n = 10\n",
"# Printing inside a comprehension works, but it also builds a throwaway list of None values\n",
"[print(n, \"x\", i, \"=\", n * i) for i in range(2, 21, 2)]\n",
"\n",
"# To get a list of strings\n",
"[str(n) + \"x\" + str(i) + \"=\" + str(n * i) for i in range(2, 21, 2)]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To obtain the same pretty printing, we create a formatted string inside the list comprehension."
]
},
{
"cell_type": "code",
"execution_count": 63,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"['10 x 2 = 20',\n",
" '10 x 4 = 40',\n",
" '10 x 6 = 60',\n",
" '10 x 8 = 80',\n",
" '10 x 10 = 100',\n",
" '10 x 12 = 120',\n",
" '10 x 14 = 140',\n",
" '10 x 16 = 160',\n",
" '10 x 18 = 180',\n",
" '10 x 20 = 200']"
]
},
"execution_count": 63,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"s = \"{0} x {1} = {2}\"\n",
"[s.format(n, i, n * i) for i in range(2, 21, 2)]"
]
},
{
"cell_type": "code",
"execution_count": 64,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"10 x 2 = 20\n",
"10 x 4 = 40\n",
"10 x 6 = 60\n",
"10 x 8 = 80\n",
"10 x 10 = 100\n",
"10 x 12 = 120\n",
"10 x 14 = 140\n",
"10 x 16 = 160\n",
"10 x 18 = 180\n",
"10 x 20 = 200\n"
]
}
],
"source": [
"# Alternatively, print each formatted line in a for loop\n",
"s = \"{0} x {1} = {2}\"\n",
"for i in range(2, 21, 2):\n",
"    print(s.format(n, i, n * i))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Finally, we use the ``join()`` method of strings: it concatenates the elements of an iterable of strings (such as our list), placing the string it is called on between consecutive elements, and returns the resulting string. The syntax is ``string.join(iterable)``. An example:\n"
]
},
{
"cell_type": "code",
"execution_count": 79,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'10 x 2 = 2010 x 4 = 4010 x 6 = 6010 x 8 = 8010 x 10 = 10010 x 12 = 12010 x 14 = 14010 x 16 = 16010 x 18 = 18010 x 20 = 200'"
]
},
"execution_count": 79,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"sep = \"\"  # empty separator: the strings are glued together with nothing in between\n",
"s = \"{0} x {1} = {2}\"\n",
"sep.join([s.format(n, i, n * i) for i in range(2, 21, 2)])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We concatenate with ``\\n`` to go to the next line, then we print."
]
},
{
"cell_type": "code",
"execution_count": 101,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"10 x 2 = 20\n",
"10 x 4 = 40\n",
"10 x 6 = 60\n",
"10 x 8 = 80\n",
"10 x 10 = 100\n",
"10 x 12 = 120\n",
"10 x 14 = 140\n",
"10 x 16 = 160\n",
"10 x 18 = 180\n",
"10 x 20 = 200\n"
]
}
],
"source": [
"s = \"{0} x {1} = {2}\"\n",
"# Use the newline character as the separator, then print the joined string\n",
"print(\"\\n\".join([s.format(n, i, n * i) for i in range(2, 21, 2)]))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Exercise 2\n",
"\n",
"Write a Python program to replace the last value of each tuple in a list with the product of the first two elements of that tuple. Suppose that the list is composed only of tuples of three integers.\n",
"\n",
"For instance, given in input the list ``l = [(10, 20, 40), (40, 50, 60), (70, 80, 90)]``, the program should output the following list of tuples\n",
"``[(10, 20, 200), (40, 50, 2000), (70, 80, 5600)]``."
]
},
{
"cell_type": "code",
"execution_count": 102,
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"text/plain": [
"[(10, 20, 200), (40, 50, 2000), (70, 80, 5600)]"
]
},
"execution_count": 102,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"l = [(10, 20, 40), (40, 50, 60), (70, 80, 90)]\n",
"\n",
"res = []\n",
"for tup in l:\n",
"    # Tuples are immutable, so we build a new tuple instead of modifying the old one\n",
"    res.append((tup[0], tup[1], tup[0] * tup[1]))\n",
"res"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, suppose that we want to address the same task as above, but the input list is now composed of tuples with a variable number of elements.\n",
"\n",
"For instance, given the list ``l = [(10, 20, 100, 40), (40, 50, 60), (70, 80, 100, 200, 300, 90)]``, the program should output the following list of tuples ``[(10, 20, 100, 200), (40, 50, 2000), (70, 80, 100, 200, 300, 5600)]``.\n",
"\n",
"We can write a one-line expression using the slice operator, whose syntax is ``[start:stop:step]``."
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[(10, 20, 100, 200), (40, 50, 2000), (70, 80, 100, 200, 300, 5600)]"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"l = [(10, 20, 100, 40), (40, 50, 60), (70, 80, 100, 200, 300, 90)]\n",
"\n",
"# tup[:-1] is every element except the last; (x,) is a one-element tuple\n",
"res = [tup[:-1] + (tup[0] * tup[1],) for tup in l]\n",
"res"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Exercise 3\n",
"\n",
"Write a program to find the smallest and the largest word, by number of characters, in a given string.\n",
"\n",
"For instance, consider the string ``string = \"A quick red fox\"``. The program should output\n",
"\n",
"``Smallest word: A\n",
"Largest word: quick``\n",
"\n",
"A possible strategy is:\n",
"\n",
"- creating a list containing all the words of the sentence;\n",
"- looping over the list and computing both the smallest and the largest word at the same time.\n",
"\n",
"<img src=\"files/es1.svg\" width=\"40%\">"
]
},
{
"cell_type": "code",
"execution_count": 72,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Smallest word: A\n",
"Largest word: quick\n"
]
}
],
"source": [
"string = \"A quick red fox\"\n",
"lis = []\n",
"tmp = \"\"\n",
"# Step 1: build the list of words one character at a time\n",
"for ch in string:\n",
"    if ch == \" \":\n",
"        if tmp:\n",
"            lis.append(tmp)\n",
"        tmp = \"\"\n",
"    else:\n",
"        tmp += ch\n",
"if tmp:\n",
"    lis.append(tmp)  # the last word is not followed by a space\n",
"\n",
"# Step 2: track the smallest and the largest word in a single pass\n",
"minw = maxw = lis[0] if lis else \"\"\n",
"for w in lis:\n",
"    if len(w) < len(minw):\n",
"        minw = w\n",
"    if len(w) > len(maxw):\n",
"        maxw = w\n",
"\n",
"print(\"Smallest word: \" + minw + \"\\nLargest word: \" + maxw)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's refine the above code. Point 1 can be addressed using the built-in ``split()`` method of strings.\n",
"\n",
"The ``split()`` method splits a string into a list. You can specify the separator; by default, the string is split on any run of whitespace."
]
},
{
"cell_type": "code",
"execution_count": 73,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Smallest word: A\n",
"Largest word: quick\n"
]
}
],
"source": [
"string = \"A quick red fox\"\n",
"lis = string.split()  # with no argument, split on any run of whitespace\n",
"minw = maxw = lis[0]\n",
"for w in lis:\n",
"    if len(w) < len(minw):\n",
"        minw = w\n",
"    if len(w) > len(maxw):\n",
"        maxw = w\n",
"\n",
"print(\"Smallest word: \" + minw + \"\\nLargest word: \" + maxw)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Point 2 can also be addressed in a smarter way. We can use the built-in functions ``min()`` and ``max()``, which return the smallest and the largest of the input values, respectively.\n",
"\n",
"Both functions accept a parameter named ``key``: a function that is applied to each element to obtain the value used for the comparison. We must specify ``key = len``, as the default ordering for strings is the lexicographic one."
]
},
{
"cell_type": "code",
"execution_count": 75,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Smallest word: A\n",
"Largest word: quick\n"
]
}
],
"source": [
"string = \"A quick red fox\"\n",
"lis = string.split(\" \") \n",
"print(\"Smallest word: \" + min(lis,key=len) + \"\\nLargest word: \" + max(lis,key=len))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Exercise 4\n",
"\n",
"Write a Python program to remove duplicates from a list of lists.\n",
"\n",
"For instance, given in input the list ``ls = [[10, 20], [40], [30, 56, 25], [10, 20], [33], [40]]``, the program should output the following list without duplicates: ``[[10, 20], [40], [30, 56, 25], [33]]``.\n",
"\n",
"<img src=\"files/es2.svg\" width = \"90%\">"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We initialize a new empty list named ``ls_no_dup``. We can address the exercise using two ``for`` loops: with the first one we pick an element from the original list, and with the second one we check whether an equal element is already in the ``ls_no_dup`` list. If the element we are currently considering is already present in ``ls_no_dup`` we don't add it again, otherwise we add it."
]
},
{
"cell_type": "code",
"execution_count": 112,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[[10, 20], [40], [30, 56, 25], [33]]"
]
},
"execution_count": 112,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ls = [[10, 20], [40], [30, 56, 25], [10, 20], [33], [40]]\n",
"\n",
"ls_no_dup = []\n",
"\n",
"for item in ls:\n",
"    # Inner loop: look for an equal element among those already kept\n",
"    found = False\n",
"    for kept in ls_no_dup:\n",
"        if item == kept:\n",
"            found = True\n",
"            break\n",
"    # Keep only the first occurrence of each element\n",
"    if not found:\n",
"        ls_no_dup.append(item)\n",
"\n",
"ls_no_dup"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can simplify the above code by using the conditional statement in a smarter way:"
]
},
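{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# A possible simplification (this cell was empty in the original notebook):\n",
"# the membership test replaces the explicit inner loop\n",
"ls_no_dup = []\n",
"for item in ls:\n",
"    if item not in ls_no_dup:\n",
"        ls_no_dup.append(item)\n",
"\n",
"ls_no_dup"
]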
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Other common ways to remove duplicates include:\n",
"- dictionaries: the ``fromkeys()`` method of ``dict`` returns a dictionary with the specified keys, and keys are unique. If we convert the dictionary to a list, we obtain a list with no duplicate values.\n",
"- sets: the ``set()`` function returns a set, which by its mathematical definition does not allow duplicates. Again, if we convert the set to a list, we obtain a list with no duplicate values.\n",
"\n",
"\n",
"Here we can't adopt these solutions: you can't use a list as a key in a ``dict``, since ``dict`` keys need to be hashable and mutable objects such as lists are not. The same holds for the elements of a ``set``."
]
},
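{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Illustrative check (this cell was empty in the original notebook):\n",
"# lists are unhashable, so set() raises a TypeError on a list of lists\n",
"try:\n",
"    set(ls)\n",
"except TypeError as err:\n",
"    print(\"TypeError:\", err)"
]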
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Compare with the following examples:"
]
},
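{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Illustrative examples (this cell was empty in the original notebook):\n",
"# with hashable elements such as integers, both approaches work\n",
"nums = [10, 20, 40, 10, 33, 40]\n",
"print(list(dict.fromkeys(nums)))  # keeps the order of first occurrence\n",
"print(list(set(nums)))            # the order of a set is not guaranteed"
]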
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Exercise 5\n",
"\n",
"Consider the following list of student records:\n",
"\n",
"``students = [{'id': 1, 'success': True, 'name': 'Theo'},\n",
" {'id': 2, 'success': False, 'name': 'Alex'},\n",
" {'id': 3, 'success': True, 'name': 'Ralph'},\n",
" {'id': 4, 'success': True, 'name': 'Ralph'}\n",
" {'id': 5, 'success': False, 'name': 'Theo'}]``\n",
" \n",
"We want to write a program to get the different values associated with \"name\" key.\n",
"\n",
"With the above list, the program should output ``['Theo', 'Alex', 'Ralph']``."
]
},
{
"cell_type": "code",
"execution_count": 148,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"['Theo', 'Alex', 'Ralph']"
]
},
"execution_count": 148,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"students = [{'id': 1, 'success': True, 'name': 'Theo'},\n",
"            {'id': 2, 'success': False, 'name': 'Alex'},\n",
"            {'id': 3, 'success': True, 'name': 'Ralph'},\n",
"            {'id': 4, 'success': True, 'name': 'Ralph'},\n",
"            {'id': 5, 'success': False, 'name': 'Theo'}]\n",
"\n",
"lis = []\n",
"for dic in students:\n",
"    # Add a name only the first time we meet it\n",
"    if dic[\"name\"] not in lis:\n",
"        lis.append(dic[\"name\"])\n",
"lis"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We recognize again the pattern that list comprehensions implement, so we can use one:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Keep a name only if it does not appear in any earlier record\n",
"[dic[\"name\"] for i, dic in enumerate(students)\n",
" if dic[\"name\"] not in [d[\"name\"] for d in students[:i]]]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Finally, we can exploit what we learned from Exercise 4, using for instance the ``set()`` function."
]
},
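{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# A possible solution (this cell was empty in the original notebook)\n",
"names = [dic[\"name\"] for dic in students]\n",
"print(list(dict.fromkeys(names)))  # the order of first occurrence is preserved\n",
"print(sorted(set(names)))          # a set has no order, so we sort for a stable output"
]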
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
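
A note on Exercise 4: lists cannot be stored in a ``set`` or used as ``dict`` keys, but a common workaround is to convert each inner list to a tuple first, since tuples of hashable values are themselves hashable. The sketch below is not part of the original notebook. The helper name ``dedupe_nested`` is hypothetical, and the code assumes that the inner lists contain only hashable values such as integers and that the interpreter is Python 3.7 or later, where ``dict`` preserves insertion order.

```python
# Hypothetical helper, not part of the original notebook.
def dedupe_nested(lists):
    """Return the inner lists without duplicates, keeping first occurrences."""
    # Lists are unhashable; tuples of integers are hashable
    as_tuples = [tuple(inner) for inner in lists]
    # dict keys are unique and, from Python 3.7 on, keep insertion order
    unique = dict.fromkeys(as_tuples)
    # Convert the surviving tuples back to lists
    return [list(t) for t in unique]


ls = [[10, 20], [40], [30, 56, 25], [10, 20], [33], [40]]
print(dedupe_nested(ls))  # [[10, 20], [40], [30, 56, 25], [33]]
```

Compared with the nested-loop solution, this version avoids comparing every element with every element already kept, at the price of building temporary tuples.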