WWW.APPSERVGRID.COM https://www.art2dec.co/fort/ |
Python3 Language Tutorial-3. June 2018 https://www.art2dec.co/fort/viewtopic.php?f=23&t=7411 |
Author: | admin [ Mon Feb 11, 2019 2:59 am ] |
Inheritance

A class that inherits from another class is called a subclass. A class that is inherited from is called a superclass. If a subclass defines attributes or methods with the same names as those of its superclass, it overrides them.

    class Wolf:
        def __init__(self, name, color):
            self.name = name
            self.color = color

        def bark(self):
            print("Grr...")

    class Dog(Wolf):
        def bark(self):
            print("Woof")

    husky = Dog("Max", "grey")
    husky.bark()

Result:
>>>
Woof
>>>

In the example above, Wolf is the superclass and Dog is the subclass.
------------------------------------------------------------------------------
Inheritance

Inheritance can also be indirect. One class can inherit from another, and that class can inherit from a third class.

Example:

    class A:
        def method(self):
            print("A method")

    class B(A):
        def another_method(self):
            print("B method")

    class C(B):
        def third_method(self):
            print("C method")

    c = C()
    c.method()
    c.another_method()
    c.third_method()

Result:
>>>
A method
B method
C method
>>>

However, circular inheritance is not possible.
------------------------------------------------------------------------------
Inheritance

The function super is a useful inheritance-related function that refers to the parent class. It can be used to find the method with a certain name in an object's superclass.

Example:

    class A:
        def spam(self):
            print(1)

    class B(A):
        def spam(self):
            print(2)
            super().spam()

    B().spam()

Result:
>>>
2
1
>>>

super().spam() calls the spam method of the superclass.
------------------------------------------------------------------------------
Magic Methods

Magic methods are special methods which have double underscores at the beginning and end of their names. They are also known as dunders. So far, the only one we have encountered is __init__, but there are several others. They are used to create functionality that can't be represented as a normal method. One common use of them is operator overloading.
This means defining methods on custom classes so that operators such as + and * can be used on their instances. An example magic method is __add__ for +.

    class Vector2D:
        def __init__(self, x, y):
            self.x = x
            self.y = y

        def __add__(self, other):
            return Vector2D(self.x + other.x, self.y + other.y)

    first = Vector2D(5, 7)
    second = Vector2D(3, 9)
    result = first + second
    print(result.x)
    print(result.y)

Result:
>>>
8
16
>>>

The __add__ method defines custom behavior for the + operator in our class. As you can see, it adds the corresponding attributes of the objects and returns a new object containing the result. Once it's defined, we can add two objects of the class together.
------------------------------------------------------------------------------
Magic Methods

More magic methods for common operators:

    __sub__ for -
    __mul__ for *
    __truediv__ for /
    __floordiv__ for //
    __mod__ for %
    __pow__ for **
    __and__ for &
    __xor__ for ^
    __or__ for |

The expression x + y is translated into x.__add__(y). However, if x hasn't implemented __add__, and x and y are of different types, then y.__radd__(x) is called. There are equivalent r methods for all magic methods just mentioned.

Example:

    class SpecialString:
        def __init__(self, cont):
            self.cont = cont

        def __truediv__(self, other):
            line = "=" * len(other.cont)
            return "\n".join([self.cont, line, other.cont])

    spam = SpecialString("spam")
    hello = SpecialString("Hello world!")
    print(spam / hello)

Result:
>>>
spam
============
Hello world!
>>>

In the example above, we defined the division operation for our class SpecialString.
------------------------------------------------------------------------------
Magic Methods

Python also provides magic methods for comparisons.

    __lt__ for <
    __le__ for <=
    __eq__ for ==
    __ne__ for !=
    __gt__ for >
    __ge__ for >=

If __ne__ is not implemented, it returns the opposite of __eq__. There are no other relationships between the other operators.
Example:

    class SpecialString:
        def __init__(self, cont):
            self.cont = cont

        def __gt__(self, other):
            for index in range(len(other.cont) + 1):
                result = other.cont[:index] + ">" + self.cont
                result += ">" + other.cont[index:]
                print(result)

    spam = SpecialString("spam")
    eggs = SpecialString("eggs")
    spam > eggs

Result:
>>>
>spam>eggs
e>spam>ggs
eg>spam>gs
egg>spam>s
eggs>spam>
>>>

As you can see, you can define any custom behavior for the overloaded operators.
------------------------------------------------------------------------------
Magic Methods

There are several magic methods for making classes act like containers.

    __len__ for len()
    __getitem__ for indexing
    __setitem__ for assigning to indexed values
    __delitem__ for deleting indexed values
    __iter__ for iteration over objects (e.g., in for loops)
    __contains__ for in

There are many other magic methods that we won't cover here, such as __call__ for calling objects as functions, and __int__, __str__, and the like, for converting objects to built-in types.

Example:

    import random

    class VagueList:
        def __init__(self, cont):
            self.cont = cont

        def __getitem__(self, index):
            return self.cont[index + random.randint(-1, 1)]

        def __len__(self):
            return random.randint(0, len(self.cont) * 2)

    vague_list = VagueList(["A", "B", "C", "D", "E"])
    print(len(vague_list))
    print(len(vague_list))
    print(vague_list[2])
    print(vague_list[2])

Result (the values vary from run to run):
>>>
6
7
D
C
>>>

We have defined __len__ so that len() returns a random number for VagueList objects. Indexing likewise returns the item at the requested position or at one of its neighbours, chosen at random by the expression index + random.randint(-1, 1).
------------------------------------------------------------------------------
Object Lifecycle

The lifecycle of an object is made up of its creation, manipulation, and destruction. The first stage of the lifecycle of an object is the definition of the class to which it belongs. The next stage is the instantiation of an instance, when __init__ is called.
Memory is allocated to store the instance. Just before this occurs, the __new__ method of the class is called. This is usually overridden only in special cases. After this has happened, the object is ready to be used. Other code can then interact with the object, by calling functions on it and accessing its attributes. Eventually, it will finish being used, and can be destroyed.
------------------------------------------------------------------------------
Object Lifecycle

When an object is destroyed, the memory allocated to it is freed up, and can be used for other purposes. Destruction of an object occurs when its reference count reaches zero. The reference count is the number of variables and other elements that refer to an object. If nothing is referring to it (it has a reference count of zero), nothing can interact with it, so it can be safely deleted. In some situations, two (or more) objects are referred to only by each other, and can therefore be deleted as well.

The del statement reduces the reference count of an object by one, and this often leads to its deletion. The magic method __del__ is called when the object is about to be destroyed, not every time the del statement is used. The process of deleting objects when they are no longer needed is called garbage collection.

In summary, an object's reference count increases when it is assigned a new name or placed in a container (list, tuple, or dictionary). The object's reference count decreases when it's deleted with del, its reference is reassigned, or its reference goes out of scope. When an object's reference count reaches zero, Python automatically deletes it.

Example:

    a = 42      # Create object <42>
    b = a       # Increase ref. count of <42>
    c = [a]     # Increase ref. count of <42>

    del a       # Decrease ref. count of <42>
    b = 100     # Decrease ref. count of <42>
    c[0] = -1   # Decrease ref. count of <42>

Lower level languages like C don't have this kind of automatic memory management.
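The reference counting described above can be observed directly. The following is a minimal sketch that assumes the standard CPython interpreter, where sys.getrefcount reports an object's current reference count and __del__ runs as soon as the last reference disappears; other Python implementations may behave differently. The class name Tracked is a hypothetical helper invented for this illustration.

```python
import sys

class Tracked:
    """Hypothetical helper class that records when an instance is destroyed."""
    destroyed = []

    def __init__(self, label):
        self.label = label

    def __del__(self):
        # Called when the object is about to be destroyed.
        Tracked.destroyed.append(self.label)

obj = Tracked("first")
# The reported count includes the temporary reference held by the call itself,
# so we only look at differences from this baseline.
base = sys.getrefcount(obj)

alias = obj                    # a second name for the same object
holder = [obj]                 # a container also holds a reference
after_adding = sys.getrefcount(obj) - base

del alias                      # remove one name
holder[0] = None               # the list no longer refers to the object
after_removing = sys.getrefcount(obj) - base

del obj                        # last reference gone, so __del__ runs (in CPython)
print(after_adding, after_removing, Tracked.destroyed)
```

A custom class is used here instead of the number 42 because small integers are shared internally by the interpreter, which makes their reference counts hard to interpret.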
------------------------------------------------------------------------------
Data Hiding

A key part of object-oriented programming is encapsulation, which involves packaging related variables and functions into a single easy-to-use object: an instance of a class. A related concept is data hiding, which states that the implementation details of a class should be hidden, and a clean standard interface presented to those who want to use the class. In other programming languages, this is usually done with private methods and attributes, which block external access to certain methods and attributes in a class.

The Python philosophy is slightly different. It is often stated as "we are all consenting adults here", meaning that you shouldn't put arbitrary restrictions on accessing parts of a class. Hence there is no way of enforcing that a method or attribute be strictly private. However, there are ways to discourage people from accessing parts of a class, such as by denoting that a part is an implementation detail and should be used at their own risk.
------------------------------------------------------------------------------
Data Hiding

Weakly private methods and attributes have a single underscore at the beginning. This signals that they are private, and shouldn't be used by external code. However, it is mostly only a convention, and does not stop external code from accessing them. Its only actual effect is that from module_name import * won't import variables that start with a single underscore.
Example:

    class Queue:
        def __init__(self, contents):
            self._hiddenlist = list(contents)

        def push(self, value):
            self._hiddenlist.insert(0, value)

        def pop(self):
            return self._hiddenlist.pop(-1)

        def __repr__(self):
            return "Queue({})".format(self._hiddenlist)

    queue = Queue([1, 2, 3])
    print(queue)
    queue.push(0)
    print(queue)
    queue.pop()
    print(queue)
    print(queue._hiddenlist)

Result:
>>>
Queue([1, 2, 3])
Queue([0, 1, 2, 3])
Queue([0, 1, 2])
[0, 1, 2]
>>>

In the code above, the attribute _hiddenlist is marked as private, but it can still be accessed from outside code. The __repr__ magic method is used for the string representation of the instance.
------------------------------------------------------------------------------
Data Hiding

Strongly private methods and attributes have a double underscore at the beginning of their names. This causes their names to be mangled, which means that they can't be accessed under their original names from outside the class. The purpose of this isn't to ensure that they are kept private, but to avoid bugs if there are subclasses that have methods or attributes with the same names. Name mangled methods can still be accessed externally, but by a different name. The method __privatemethod of class Spam could be accessed externally with _Spam__privatemethod.

Example:

    class Spam:
        __egg = 7

        def print_egg(self):
            print(self.__egg)

    s = Spam()
    s.print_egg()
    print(s._Spam__egg)
    print(s.__egg)

Result:
>>>
7
7
AttributeError: 'Spam' object has no attribute '__egg'
>>>

Basically, Python protects those members by internally changing the name to include the class name.
------------------------------------------------------------------------------
Class Methods

Methods of objects we've looked at so far are called by an instance of a class, which is then passed to the self parameter of the method. Class methods are different - they are called by a class, which is passed to the cls parameter of the method.
A common use of these is factory methods, which create an instance of a class using different parameters than those usually passed to the class constructor. Class methods are marked with the classmethod decorator.

Example:

    class Rectangle:
        def __init__(self, width, height):
            self.width = width
            self.height = height

        def calculate_area(self):
            return self.width * self.height

        @classmethod
        def new_square(cls, side_length):
            return cls(side_length, side_length)

    square = Rectangle.new_square(5)
    print(square.calculate_area())

Result:
>>>
25
>>>

new_square is a class method and is called on the class, rather than on an instance of the class. It returns a new object of the class cls. Technically, the parameter names self and cls are just conventions; they could be changed to anything else. However, they are universally followed, so it is wise to stick to using them.
------------------------------------------------------------------------------
Static Methods

Static methods are similar to class methods, except they don't receive any additional arguments; they are identical to normal functions that belong to a class. They are marked with the staticmethod decorator.

Example:

    class Pizza:
        def __init__(self, toppings):
            self.toppings = toppings

        @staticmethod
        def validate_topping(topping):
            if topping == "pineapple":
                raise ValueError("No pineapples!")
            else:
                return True

    ingredients = ["cheese", "onions", "spam"]
    if all(Pizza.validate_topping(i) for i in ingredients):
        pizza = Pizza(ingredients)

Static methods behave like plain functions, except for the fact that you can call them from an instance of the class.
------------------------------------------------------------------------------
Properties

Properties provide a way of customizing access to instance attributes.
They are created by putting the property decorator above a method, which means that when the instance attribute with the same name as the method is accessed, the method will be called instead. One common use of a property is to make an attribute read-only.

Example:

    class Pizza:
        def __init__(self, toppings):
            self.toppings = toppings

        @property
        def pineapple_allowed(self):
            return False

    pizza = Pizza(["cheese", "tomato"])
    print(pizza.pineapple_allowed)
    pizza.pineapple_allowed = True

Result (the exact wording of the error message varies between Python versions):
>>>
False
AttributeError: can't set attribute
>>>
------------------------------------------------------------------------------
Properties

Properties can also be set by defining setter/getter functions. The setter function sets the corresponding property's value. The getter gets the value. To define a setter, you need to use a decorator of the same name as the property, followed by a dot and the word setter. The same applies to defining getter functions.

Example:

    class Pizza:
        def __init__(self, toppings):
            self.toppings = toppings
            self._pineapple_allowed = False

        @property
        def pineapple_allowed(self):
            return self._pineapple_allowed

        @pineapple_allowed.setter
        def pineapple_allowed(self, value):
            if value:
                password = input("Enter the password: ")
                if password == "Sw0rdf1sh!":
                    self._pineapple_allowed = value
                else:
                    raise ValueError("Alert! Intruder!")

    pizza = Pizza(["cheese", "tomato"])
    print(pizza.pineapple_allowed)
    pizza.pineapple_allowed = True
    print(pizza.pineapple_allowed)

Result:
>>>
False
Enter the password: Sw0rdf1sh!
True
>>>
------------------------------------------------------------------------------
A Simple Game

Object-orientation is very useful when managing different objects and their relations. That is especially useful when you are developing games with different characters and features. Let's look at an example project that shows how classes are used in game development.
The game to be developed is an old-fashioned text-based adventure game. Below is the function handling input and simple parsing.

    def get_input():
        command = input(": ").split()
        if not command:
            # Ignore empty input instead of failing on command[0].
            return
        verb_word = command[0]
        if verb_word in verb_dict:
            verb = verb_dict[verb_word]
        else:
            print("Unknown verb {}".format(verb_word))
            return

        if len(command) >= 2:
            noun_word = command[1]
            print(verb(noun_word))
        else:
            print(verb("nothing"))

    def say(noun):
        return 'You said "{}"'.format(noun)

    verb_dict = {
        "say": say,
    }

    while True:
        get_input()

Result:
>>>
: say Hello!
You said "Hello!"
: say Goodbye!
You said "Goodbye!"
: test
Unknown verb test

The code above takes input from the user, and tries to match the first word with a command in verb_dict. If a match is found, the corresponding function is called.
------------------------------------------------------------------------------
A Simple Game

The next step is to use classes to represent game objects.

    class GameObject:
        class_name = ""
        desc = ""
        objects = {}

        def __init__(self, name):
            self.name = name
            GameObject.objects[self.class_name] = self

        def get_desc(self):
            return self.class_name + "\n" + self.desc

    class Goblin(GameObject):
        class_name = "goblin"
        desc = "A foul creature"

    goblin = Goblin("Gobbly")

    def examine(noun):
        if noun in GameObject.objects:
            return GameObject.objects[noun].get_desc()
        else:
            return "There is no {} here.".format(noun)

We created a Goblin class, which inherits from the GameObject class. We also created a new function examine, which returns the object's description. Now we can add a new "examine" verb to our dictionary and try it out!

    verb_dict = {
        "say": say,
        "examine": examine,
    }

Combine this code with the code in our previous example, and run the program.

>>>
: say Hello!
You said "Hello!"
: examine goblin
goblin
A foul creature
: examine elf
There is no elf here.
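For trying the pieces above without typing at a prompt, the verb dispatch can be wrapped in a function that takes a command string and returns the reply. This is only a sketch: run_command and the "Please type a command." message are hypothetical additions that are not part of the original game, while GameObject, Goblin, say, examine, and verb_dict are repeated from the text so that the block runs on its own.

```python
class GameObject:
    class_name = ""
    desc = ""
    objects = {}   # registry shared by all game objects, keyed by class_name

    def __init__(self, name):
        self.name = name
        GameObject.objects[self.class_name] = self

    def get_desc(self):
        return self.class_name + "\n" + self.desc


class Goblin(GameObject):
    class_name = "goblin"
    desc = "A foul creature"


def say(noun):
    return 'You said "{}"'.format(noun)


def examine(noun):
    if noun in GameObject.objects:
        return GameObject.objects[noun].get_desc()
    return "There is no {} here.".format(noun)


verb_dict = {"say": say, "examine": examine}


def run_command(line):
    """Hypothetical helper: parse one command line and return the reply text."""
    words = line.split()
    if not words:
        return "Please type a command."
    verb_word = words[0]
    if verb_word not in verb_dict:
        return "Unknown verb {}".format(verb_word)
    # Use the second word as the noun, or "nothing" if there is none.
    noun_word = words[1] if len(words) >= 2 else "nothing"
    return verb_dict[verb_word](noun_word)


goblin = Goblin("Gobbly")
commands = ("say Hello!", "examine goblin", "examine elf", "test")
replies = [run_command(command) for command in commands]
for reply in replies:
    print(reply)
```

Returning the reply instead of printing it inside the parser keeps the game logic separate from input and output, which makes each verb easy to check in isolation.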
------------------------------------------------------------------------------
A Simple Game

This code adds more detail to the Goblin class and allows you to fight goblins.

    class Goblin(GameObject):
        def __init__(self, name):
            self.class_name = "goblin"
            self.health = 3
            self._desc = "A foul creature"
            super().__init__(name)

        @property
        def desc(self):
            if self.health >= 3:
                return self._desc
            elif self.health == 2:
                health_line = "It has a wound on its knee."
            elif self.health == 1:
                health_line = "Its left arm has been cut off!"
            elif self.health <= 0:
                health_line = "It is dead."
            return self._desc + "\n" + health_line

        @desc.setter
        def desc(self, value):
            self._desc = value

    def hit(noun):
        # Default reply; it also covers objects that exist but are not goblins.
        msg = "There is no {} here.".format(noun)
        if noun in GameObject.objects:
            thing = GameObject.objects[noun]
            if isinstance(thing, Goblin):
                thing.health = thing.health - 1
                if thing.health <= 0:
                    msg = "You killed the goblin!"
                else:
                    msg = "You hit the {}".format(thing.class_name)
        return msg

As with "examine", the new verb has to be added to the dictionary ("hit": hit) before it can be used.

Result:
>>>
: hit goblin
You hit the goblin
: examine goblin
goblin
A foul creature
It has a wound on its knee.
: hit goblin
You hit the goblin
: hit goblin
You killed the goblin!
: examine goblin
goblin
A foul creature
It is dead.
:

This was just a simple sample. You could create different classes (e.g., elves, orcs, humans), fight them, make them fight each other, and so on.
------------------------------------------------------------------------------
Regular Expressions

Regular expressions are a powerful tool for various kinds of string manipulation. They are a domain specific language (DSL) that is present as a library in most modern programming languages, not just Python. They are useful for two main tasks:
- verifying that strings match a pattern (for instance, that a string has the format of an email address),
- performing substitutions in a string (such as changing all American spellings to British ones).

Domain specific languages are highly specialized mini programming languages.
Regular expressions are a popular example, and SQL (for database manipulation) is another. Private domain-specific languages are often used for specific industrial purposes.
------------------------------------------------------------------------------
Regular Expressions

Regular expressions in Python can be accessed using the re module, which is part of the standard library. After you've defined a regular expression, the re.match function can be used to determine whether it matches at the beginning of a string. If it does, match returns an object representing the match; if not, it returns None. To avoid any confusion while working with regular expressions, we will write patterns as raw strings, r"expression". Raw strings don't treat backslashes as escape characters, which makes regular expressions easier to write.

Example:

    import re

    pattern = r"spam"

    if re.match(pattern, "spamspamspam"):
        print("Match")
    else:
        print("No match")

Result:
>>>
Match
>>>

The above example checks if the pattern "spam" matches the string and prints "Match" if it does. Here the pattern is a simple word, but there are various characters which have a special meaning when they are used in a regular expression.
------------------------------------------------------------------------------
Regular Expressions

Other functions to match patterns are re.search and re.findall. The function re.search finds a match of a pattern anywhere in the string. The function re.findall returns a list of all substrings that match a pattern.

Example:

    import re

    pattern = r"spam"

    if re.match(pattern, "eggspamsausagespam"):
        print("Match")
    else:
        print("No match")

    if re.search(pattern, "eggspamsausagespam"):
        print("Match")
    else:
        print("No match")

    print(re.findall(pattern, "eggspamsausagespam"))

Result:
>>>
No match
Match
['spam', 'spam']
>>>

In the example above, the match function did not match the pattern, as it looks only at the beginning of the string. The search function found a match in the string.
The function re.finditer does the same thing as re.findall, except it returns an iterator of match objects, rather than a list.
------------------------------------------------------------------------------
Regular Expressions

The regex search returns an object with several methods that give details about it. These methods include group, which returns the string matched; start and end, which return the start and end positions of the match; and span, which returns the start and end positions of the match as a tuple.

Example:

    import re

    pattern = r"pam"

    match = re.search(pattern, "eggspamsausage")
    if match:
        print(match.group())
        print(match.start())
        print(match.end())
        print(match.span())

Result:
>>>
pam
4
7
(4, 7)
>>>
------------------------------------------------------------------------------
Metacharacters

Metacharacters are what make regular expressions more powerful than normal string methods. They allow you to create regular expressions to represent concepts like "one or more repetitions of a vowel". The existence of metacharacters poses a problem if you want to create a regular expression (or regex) that matches a literal metacharacter, such as "$". You can do this by escaping the metacharacters by putting a backslash in front of them. However, this can cause problems, since backslashes also have an escaping function in normal Python strings. This can mean putting three or four backslashes in a row to do all the escaping. To avoid this, you can use a raw string, which is a normal string with an "r" in front of it. We saw usage of raw strings in the previous lesson.
------------------------------------------------------------------------------
Metacharacters

The first metacharacter we will look at is . (dot). This matches any character, other than a new line.
Example:

    import re

    pattern = r"gr.y"

    if re.match(pattern, "grey"):
        print("Match 1")

    if re.match(pattern, "gray"):
        print("Match 2")

    if re.match(pattern, "blue"):
        print("Match 3")

Result:
>>>
Match 1
Match 2
>>>
------------------------------------------------------------------------------
Metacharacters

The next two metacharacters are ^ and $. These match the start and end of a string, respectively.

Example:

    import re

    pattern = r"^gr.y$"

    if re.match(pattern, "grey"):
        print("Match 1")

    if re.match(pattern, "gray"):
        print("Match 2")

    if re.match(pattern, "stingray"):
        print("Match 3")

Result:
>>>
Match 1
Match 2
>>>

The pattern "^gr.y$" means that the string should start with gr, then follow with any character, except a newline, and end with y.
------------------------------------------------------------------------------
Character Classes

Character classes provide a way to match only one of a specific set of characters. A character class is created by putting the characters it matches inside square brackets.

Example:

    import re

    pattern = r"[aeiou]"

    if re.search(pattern, "grey"):
        print("Match 1")

    if re.search(pattern, "qwertyuiop"):
        print("Match 2")

    if re.search(pattern, "rhythm myths"):
        print("Match 3")

Result:
>>>
Match 1
Match 2
>>>

The pattern [aeiou] in the search function matches all strings that contain any one of the characters defined.
------------------------------------------------------------------------------
Character Classes

Character classes can also match ranges of characters. Some examples:

The class [a-z] matches any lowercase alphabetic character.
The class [G-P] matches any uppercase character from G to P.
The class [0-9] matches any digit.

Multiple ranges can be included in one class. For example, [A-Za-z] matches a letter of any case.
Example:

    import re

    pattern = r"[A-Z][A-Z][0-9]"

    if re.search(pattern, "LS8"):
        print("Match 1")

    if re.search(pattern, "E3"):
        print("Match 2")

    if re.search(pattern, "1ab"):
        print("Match 3")

Result:
>>>
Match 1
>>>

The pattern in the example above matches strings that contain two alphabetic uppercase letters followed by a digit.
------------------------------------------------------------------------------
Character Classes

Place a ^ at the start of a character class to invert it. This causes it to match any character other than the ones included. Other metacharacters, such as $ and ., have no meaning within character classes. The metacharacter ^ has no meaning unless it is the first character in a class.

Example:

    import re

    pattern = r"[^A-Z]"

    if re.search(pattern, "this is all quiet"):
        print("Match 1")

    if re.search(pattern, "AbCdEfG123"):
        print("Match 2")

    if re.search(pattern, "THISISALLSHOUTING"):
        print("Match 3")

Result:
>>>
Match 1
Match 2
>>>

The pattern [^A-Z] matches any single character that is not an uppercase letter, so the search fails only for strings made up entirely of uppercase letters. Note that the ^ should be inside the brackets to invert the character class.
------------------------------------------------------------------------------
Metacharacters

Some more metacharacters are *, +, ?, { and }. These specify numbers of repetitions. The metacharacter * means "zero or more repetitions of the previous thing". It tries to match as many repetitions as possible. The "previous thing" can be a single character, a class, or a group of characters in parentheses.

Example:

    import re

    pattern = r"egg(spam)*"

    if re.match(pattern, "egg"):
        print("Match 1")

    if re.match(pattern, "eggspamspamegg"):
        print("Match 2")

    if re.match(pattern, "spam"):
        print("Match 3")

Result:
>>>
Match 1
Match 2
>>>

The example above matches strings that start with "egg" and follow with zero or more "spam"s.
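To see that * matches as many repetitions as possible, it helps to look at the text the pattern actually consumed rather than only checking whether it matched. The sketch below reuses the egg(spam)* pattern from the text together with the group and span methods of the match object; the variable names matched_text, matched_span, and bare are chosen only for this illustration.

```python
import re

pattern = r"egg(spam)*"

# re.match only requires the pattern to match at the start of the string,
# so the trailing "egg" is simply left unmatched.
match = re.match(pattern, "eggspamspamegg")
matched_text = match.group()   # the part of the string the pattern consumed
matched_span = match.span()    # its start and end positions

# With zero repetitions of (spam), the pattern still matches the bare "egg".
bare = re.match(pattern, "egg").group()

print(matched_text)   # eggspamspam
print(matched_span)   # (0, 11)
print(bare)           # egg
```

Both occurrences of "spam" are consumed before the match stops, which is the greedy behavior described above.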
------------------------------------------------------------------------------
Metacharacters

The metacharacter + is very similar to *, except it means "one or more repetitions", as opposed to "zero or more repetitions".

Example:

    import re

    pattern = r"g+"

    if re.match(pattern, "g"):
        print("Match 1")

    if re.match(pattern, "gggggggggggggg"):
        print("Match 2")

    if re.match(pattern, "abc"):
        print("Match 3")

Result:
>>>
Match 1
Match 2
>>>

To summarize:
* matches 0 or more occurrences of the preceding expression.
+ matches 1 or more occurrences of the preceding expression.
------------------------------------------------------------------------------
Metacharacters

The metacharacter ? means "zero or one repetitions".

Example:

    import re

    pattern = r"ice(-)?cream"

    if re.match(pattern, "ice-cream"):
        print("Match 1")

    if re.match(pattern, "icecream"):
        print("Match 2")

    if re.match(pattern, "sausages"):
        print("Match 3")

    if re.match(pattern, "ice--ice"):
        print("Match 4")

Result:
>>>
Match 1
Match 2
>>>
------------------------------------------------------------------------------
Curly Braces

Curly braces can be used to represent the number of repetitions between two numbers. The regex {x,y} means "between x and y repetitions of something". Hence {0,1} is the same thing as ?. If the first number is missing, it is taken to be zero. If the second number is missing, it is taken to be infinity.

Example:

    import re

    pattern = r"9{1,3}$"

    if re.match(pattern, "9"):
        print("Match 1")

    if re.match(pattern, "999"):
        print("Match 2")

    if re.match(pattern, "9999"):
        print("Match 3")

Result:
>>>
Match 1
Match 2
>>>

Used with re.match, "9{1,3}$" matches strings that consist of 1 to 3 nines.
------------------------------------------------------------------------------
Groups

A group can be created by surrounding part of a regular expression with parentheses.
This means that a group can be given as an argument to metacharacters such as * and ?.

Example:

    import re

    pattern = r"egg(spam)*"

    if re.match(pattern, "egg"):
        print("Match 1")

    if re.match(pattern, "eggspamspamspamegg"):
        print("Match 2")

    if re.match(pattern, "spam"):
        print("Match 3")

(spam) represents a group in the example pattern shown above.

Result:
>>>
Match 1
Match 2
>>>
------------------------------------------------------------------------------
Groups

The content of groups in a match can be accessed using the group function. A call of group(0) or group() returns the whole match. A call of group(n), where n is greater than 0, returns the nth group from the left. The method groups() returns all groups from 1 upward.

Example:

    import re

    pattern = r"a(bc)(de)(f(g)h)i"

    match = re.match(pattern, "abcdefghijklmnop")
    if match:
        print(match.group())
        print(match.group(0))
        print(match.group(1))
        print(match.group(2))
        print(match.groups())

Result:
>>>
abcdefghi
abcdefghi
bc
de
('bc', 'de', 'fgh', 'g')
>>>

As you can see from the example above, groups can be nested.
------------------------------------------------------------------------------
Groups

There are several kinds of special groups. Two useful ones are named groups and non-capturing groups. Named groups have the format (?P<name>...), where name is the name of the group, and ... is the content. They behave exactly the same as normal groups, except they can be accessed by group(name) in addition to their number. Non-capturing groups have the format (?:...). They are not accessible by the group method, so they can be added to an existing regular expression without breaking the numbering.
Example:

    import re

    pattern = r"(?P<first>abc)(?:def)(ghi)"

    match = re.match(pattern, "abcdefghi")
    if match:
        print(match.group("first"))
        print(match.groups())

Result:
>>>
abc
('abc', 'ghi')
>>>
------------------------------------------------------------------------------
Metacharacters

Another important metacharacter is |. This means "or", so red|blue matches either "red" or "blue".

Example:

    import re

    pattern = r"gr(a|e)y"

    match = re.match(pattern, "gray")
    if match:
        print("Match 1")

    match = re.match(pattern, "grey")
    if match:
        print("Match 2")

    match = re.match(pattern, "griy")
    if match:
        print("Match 3")

Result:
>>>
Match 1
Match 2
>>>
------------------------------------------------------------------------------
Special Sequences

There are various special sequences you can use in regular expressions. They are written as a backslash followed by another character. One useful special sequence is a backslash and a number between 1 and 99, e.g., \1 or \17. This matches the text that was matched by the group with that number.

Example:

    import re

    pattern = r"(.+) \1"

    match = re.match(pattern, "word word")
    if match:
        print("Match 1")

    match = re.match(pattern, "?! ?!")
    if match:
        print("Match 2")

    match = re.match(pattern, "abc cde")
    if match:
        print("Match 3")

Result:
>>>
Match 1
Match 2
>>>

Note that "(.+) \1" is not the same as "(.+) (.+)", because \1 refers to the text that the first group actually matched, and not to the regex pattern inside the group.
------------------------------------------------------------------------------
Special Sequences

More useful special sequences are \d, \s, and \w. These match digits, whitespace, and word characters respectively. In ASCII mode they are equivalent to [0-9], [ \t\n\r\f\v], and [a-zA-Z0-9_]. In Unicode mode they match certain other characters, as well. For instance, \w matches letters with accents.
Versions of these special sequences with upper-case letters - \D, \S, and \W - mean the opposite of the lower-case versions. For instance, \D matches anything that isn't a digit.

Example:

    import re

    pattern = r"(\D+\d)"

    match = re.match(pattern, "Hi 999!")
    if match:
        print("Match 1")

    match = re.match(pattern, "1, 23, 456!")
    if match:
        print("Match 2")

    match = re.match(pattern, " ! $?")
    if match:
        print("Match 3")

Result:

    >>>
    Match 1
    >>>

(\D+\d) matches one or more non-digits followed by a digit.

------------------------------------------------------------------------
Special Sequences

Additional special sequences are \A, \Z, and \b.
The sequences \A and \Z match the beginning and end of a string, respectively.
The sequence \b matches the empty string between \w and \W characters, or between \w characters and the beginning or end of the string. Informally, it represents the boundary between words.
The sequence \B matches the empty string anywhere else.

Example:

    import re

    pattern = r"\b(cat)\b"

    match = re.search(pattern, "The cat sat!")
    if match:
        print("Match 1")

    match = re.search(pattern, "We s>cat<tered?")
    if match:
        print("Match 2")

    match = re.search(pattern, "We scattered.")
    if match:
        print("Match 3")

Result:

    >>>
    Match 1
    Match 2
    >>>

"\b(cat)\b" matches the word "cat" surrounded by word boundaries.

------------------------------------------------------------------------
Email Extraction

To demonstrate a sample usage of regular expressions, let's create a program to extract email addresses from a string.
Suppose we have a text that contains an email address:

    text = "Please contact info@sololearn.com for assistance"

Our goal is to extract the substring "info@sololearn.com".
A basic email address consists of a word and may include dots or dashes. This is followed by the @ sign and the domain name (the name, a dot, and the domain name suffix).
This is the basis for building our regular expression.

    pattern = r"([\w\.-]+)@([\w\.-]+)(\.[\w\.]+)"

[\w\.-]+ matches one or more word characters, dots or dashes.
The regex above says that the string should contain a word (with dots and dashes allowed), followed by the @ sign, then another similar word, then a dot and another word.

Our regex contains three groups:
1 - first part of the email address.
2 - domain name without the suffix.
3 - the domain suffix.

------------------------------------------------------------------------
Email Extraction

Putting it all together:

    import re

    pattern = r"([\w\.-]+)@([\w\.-]+)(\.[\w\.]+)"
    text = "Please contact info@sololearn.com for assistance"

    match = re.search(pattern, text)
    if match:
        print(match.group())

Result:

    >>>
    info@sololearn.com
    >>>

In case the string contains multiple email addresses, we could use the re.findall function instead of re.search to extract all of them. Because the pattern contains groups, re.findall returns a tuple of the three groups for each address rather than the full address string.
The regex in this example is for demonstration purposes only. A much more complex regex is required to fully validate an email address.

------------------------------------------------------------------------
The Zen of Python

Writing programs that actually do what they are supposed to do is just one component of being a good Python programmer.
It's also important to write clean code that is easily understood, even weeks after you've written it.
One way of doing this is to follow the Zen of Python, a somewhat tongue-in-cheek set of principles that serves as a guide to programming the Pythoneer way. Use the following code to access the Zen of Python.

    import this

Result:

    The Zen of Python, by Tim Peters

    Beautiful is better than ugly.
    Explicit is better than implicit.
    Simple is better than complex.
    Complex is better than complicated.
    Flat is better than nested.
    Sparse is better than dense.
    Readability counts.
    Special cases aren't special enough to break the rules.
    Although practicality beats purity.
    Errors should never pass silently.
    Unless explicitly silenced.
    In the face of ambiguity, refuse the temptation to guess.
    There should be one-- and preferably only one --obvious way to do it.
    Although that way may not be obvious at first unless you're Dutch.
    Now is better than never.
    Although never is often better than *right* now.
    If the implementation is hard to explain, it's a bad idea.
    If the implementation is easy to explain, it may be a good idea.
    Namespaces are one honking great idea -- let's do more of those!

------------------------------------------------------------------------
The Zen of Python

Some lines in the Zen of Python may need more explanation.
Explicit is better than implicit: It is best to spell out exactly what your code is doing. This is why adding a numeric string to an integer requires explicit conversion, rather than having it happen behind the scenes, as it does in some other languages.
Flat is better than nested: Heavily nested structures (lists of lists, of lists, and on and on…) should be avoided.
Errors should never pass silently: In general, when an error occurs, you should output some sort of error message, rather than ignoring it.

There are 20 principles in the Zen of Python, but only 19 lines of text.
The 20th principle is a matter of opinion, but our interpretation is that the blank line means "Use whitespace".
The line "There should be one-- and preferably only one --obvious way to do it" references and contradicts the Perl language philosophy that there should be more than one way to do it.

------------------------------------------------------------------------
PEP

Python Enhancement Proposals (PEPs) are suggestions for improvements to the language, made by experienced Python developers.
PEP 8 is a style guide on the subject of writing readable code.
It contains a number of guidelines on naming, which are summarized here:
- modules should have short, all-lowercase names;
- class names should be in the CapWords style;
- most variable and function names should be lowercase_with_underscores;
- constants (variables that never change value) should be CAPS_WITH_UNDERSCORES;
- names that would clash with Python keywords (such as 'class' or 'if') should have a trailing underscore.

PEP 8 also recommends using spaces around operators and after commas to increase readability.
However, whitespace should not be overused. For instance, avoid having any space directly inside any type of brackets.

------------------------------------------------------------------------
PEP 8

Other PEP 8 suggestions include the following:
- lines shouldn't be longer than 79 characters;
- 'from module import *' should be avoided;
- there should only be one statement per line.

It also suggests that you use spaces, rather than tabs, to indent. However, to some extent, this is a matter of personal preference. If you use spaces, use 4 per indentation level. It's more important to choose one and stick to it.
The most important advice in the PEP is to ignore it when it makes sense to do so. Don't bother with following PEP suggestions when doing so would make your code less readable, inconsistent with the surrounding code, or not backwards compatible.
However, by and large, following PEP 8 will greatly enhance the quality of your code.

Some other notable PEPs that cover code style:
PEP 20: The Zen of Python
PEP 257: Docstring Conventions

------------------------------------------------------------------------
Function Arguments

Python allows you to define functions that take a varying number of arguments.
Using *args as a function parameter enables you to pass an arbitrary number of arguments to that function.
The arguments are then accessible as the tuple args in the body of the function.

Example:

    def function(named_arg, *args):
        print(named_arg)
        print(args)

    function(1, 2, 3, 4, 5)

Result:

    >>>
    1
    (2, 3, 4, 5)
    >>>

The parameter *args must come after the named parameters to a function.
The name args is just a convention; you can choose to use another.

------------------------------------------------------------------------
Default Values

Named parameters to a function can be made optional by giving them a default value.
These must come after named parameters without a default value.

Example:

    def function(x, y, food="spam"):
        print(food)

    function(1, 2)
    function(3, 4, "egg")

Result:

    >>>
    spam
    egg
    >>>

If the argument is passed in, the default value is ignored. If the argument is not passed in, the default value is used.

------------------------------------------------------------------------
Function Arguments

**kwargs (standing for keyword arguments) allows you to handle named arguments that you have not defined in advance.
Inside the function, kwargs is a dictionary in which the keys are the argument names, and the values are the argument values.

Example:

    def my_func(x, y=7, *args, **kwargs):
        print(kwargs)

    my_func(2, 3, 4, 5, 6, a=7, b=8)

Result:

    >>>
    {'a': 7, 'b': 8}
    >>>

a and b are the names of the keyword arguments that we passed in the function call.
The arguments collected in kwargs are not included in args.

------------------------------------------------------------------------
Tuple Unpacking

Tuple unpacking allows you to assign each item in an iterable (often a tuple) to a variable.

Example:

    numbers = (1, 2, 3)
    a, b, c = numbers
    print(a)
    print(b)
    print(c)

Result:

    >>>
    1
    2
    3
    >>>

This can also be used to swap variables by writing a, b = b, a, since b, a on the right-hand side forms the tuple (b, a), which is then unpacked.
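A minimal sketch of the swap described above, together with one more common use of tuple unpacking: receiving several return values from a function. The helper min_max is hypothetical and exists only for this illustration.

```python
a, b = 1, 2
# The right-hand side builds the tuple (b, a), which is then unpacked into a and b.
a, b = b, a
print(a, b)


def min_max(values):
    """Hypothetical helper: return the smallest and largest item as a tuple."""
    return min(values), max(values)


# The returned tuple is unpacked into two separate variables.
lowest, highest = min_max([3, 1, 4, 1, 5])
print(lowest, highest)
```

No temporary variable is needed for the swap, because the tuple on the right-hand side is fully built before any assignment takes place.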
------------------------------------------------------------------------
Tuple Unpacking

A variable that is prefaced with an asterisk (*) takes all values from the iterable that are left over from the other variables.

Example:

    a, b, *c, d = [1, 2, 3, 4, 5, 6, 7, 8, 9]
    print(a)
    print(b)
    print(c)
    print(d)

Result:

    >>>
    1
    2
    [3, 4, 5, 6, 7, 8]
    9
    >>>

------------------------------------------------------------------------
Ternary Operator

Conditional expressions provide the functionality of if statements while using less code. They shouldn't be overused, as they can easily reduce readability, but they are often useful when assigning variables.
Conditional expressions are also known as applications of the ternary operator.

Example:

    a = 7
    b = 1 if a >= 5 else 42
    print(b)

Result:

    >>>
    1
    >>>

The ternary operator checks the condition and returns the corresponding value.
In the example above, as the condition is true, b is assigned 1. If a were less than 5, b would have been assigned 42.

Another example:

    status = 1
    msg = "Logout" if status == 1 else "Login"

The ternary operator is so called because, unlike most operators, it takes three arguments.

------------------------------------------------------------------------
else

The else statement is most commonly used along with the if statement, but it can also follow a for or while loop, which gives it a different meaning.
With a for or while loop, the code in the else block is executed if the loop finishes normally (that is, when a break statement does not cause an exit from the loop).

Example:

    for i in range(10):
        if i == 999:
            break
    else:
        print("Unbroken 1")

    for i in range(10):
        if i == 5:
            break
    else:
        print("Unbroken 2")

Result:

    >>>
    Unbroken 1
    >>>

The first for loop executes normally, resulting in the printing of "Unbroken 1".
The second loop exits due to a break, which is why its else statement is not executed.

------------------------------------------------------------------------
else

The else statement can also be used with try/except statements.
In this case, the code within it is only executed if no error occurs in the try statement.

Example:

    try:
        print(1)
    except ZeroDivisionError:
        print(2)
    else:
        print(3)

    try:
        print(1/0)
    except ZeroDivisionError:
        print(4)
    else:
        print(5)

Result:

    >>>
    1
    3
    4
    >>>

------------------------------------------------------------------------
__main__

Most Python code is either a module to be imported, or a script that does something.
However, sometimes it is useful to make a file that can be both imported as a module and run as a script.
To do this, place the script code inside an if __name__ == "__main__": block.
This ensures that it won't be run if the file is imported.

Example:

    def function():
        print("This is a module function")

    if __name__ == "__main__":
        print("This is a script")

Result:

    >>>
    This is a script
    >>>

When the Python interpreter reads a source file, it executes all of the code it finds in the file. Before executing the code, it defines a few special variables.
For example, if the Python interpreter is running that module (the source file) as the main program, it sets the special __name__ variable to the value "__main__". If the file is being imported from another module, __name__ is set to the module's name.

------------------------------------------------------------------------
__main__

If we save the code from our previous example as a file called sololearn.py, we can then import it into another script as a module, using the name sololearn.

sololearn.py

    def function():
        print("This is a module function")

    if __name__ == "__main__":
        print("This is a script")

some_script.py

    import sololearn

    sololearn.function()

Result:

    >>>
    This is a module function
    >>>

We've created a custom module called sololearn, and then used it in another script. The line "This is a script" is not printed, because __name__ is "sololearn" rather than "__main__" when the file is imported.

------------------------------------------------------------------------
Major 3rd-Party Libraries

The Python standard library alone contains extensive functionality.
However, some tasks require the use of third-party libraries. Some major third-party libraries:
Django: One of the most widely used web frameworks written in Python, Django powers websites that include Instagram and Disqus. It has many useful features, and whatever features it lacks are covered by extension packages.
CherryPy and Flask are also popular web frameworks.
For scraping data from websites, the library BeautifulSoup is very useful, and leads to better results than building your own scraper with regular expressions.
While Python does offer modules for programmatically accessing websites, such as urllib, they are quite cumbersome to use. The third-party library requests makes it much easier to send HTTP requests.

------------------------------------------------------------------------
Major 3rd-Party Libraries

A number of third-party modules are available that make it much easier to carry out scientific and mathematical computing with Python.
The module matplotlib allows you to create graphs based on data in Python.
The module NumPy allows for the use of multidimensional arrays that are much faster than the native Python solution of nested lists. It also contains functions to perform mathematical operations such as matrix transformations on the arrays.
The library SciPy contains numerous extensions to the functionality of NumPy.

Python can also be used for game development.
Usually, it is used as a scripting language for games written in other languages, but it can be used to make games by itself.
For 3D games, the library Panda3D can be used. For 2D games, you can use pygame.

------------------------------------------------------------------------
Packaging

In Python, the term packaging refers to putting modules you have written in a standard format, so that other programmers can install and use them with ease.
This involves use of the module setuptools (the older distutils module was removed from the standard library in Python 3.12).
The first step in packaging is to organize existing files correctly. Place all of the files you want to put in a library in the same parent directory. This directory should also contain a file called __init__.py, which can be blank but must be present in the directory.
This directory goes into another directory containing the readme and license, as well as an important file called setup.py.

Example directory structure:

    SoloLearn/
        LICENSE.txt
        README.txt
        setup.py
        sololearn/
            __init__.py
            sololearn.py
            sololearn2.py

You can place as many script files in the directory as you need.

------------------------------------------------------------------------
Packaging

The next step in packaging is to write the setup.py file.
This contains the information necessary to assemble the package (name, version, etc.) so that it can be uploaded to PyPI and installed with pip.

Example of a setup.py file:

    from setuptools import setup

    setup(
        name='SoloLearn',
        version='0.1dev',
        packages=['sololearn',],
        license='MIT',
        long_description=open('README.txt').read(),
    )

After creating the setup.py file, upload the package to PyPI, or use the command line to create a binary distribution (an executable installer).
To build a source distribution, use the command line to navigate to the directory containing setup.py, and run the command python setup.py sdist.
Run python setup.py bdist or, for Windows, python setup.py bdist_wininst to build a binary distribution.
Use python setup.py register, followed by python setup.py sdist upload, to upload a package.
Finally, install a package with python setup.py install.

------------------------------------------------------------------------
Packaging

The previous lesson covered packaging modules for use by other Python programmers. However, many computer users who are not programmers do not have Python installed. Therefore, it is useful to package scripts as executable files for the relevant platform, such as the Windows or Mac operating systems.
This is usually not necessary for Linux, as most Linux users do have Python installed, and are able to run scripts as they are.

For Windows, many tools are available for converting scripts to executables. For example, py2exe can be used to package a Python script, along with the libraries it requires, into a single executable.
PyInstaller and cx_Freeze serve the same purpose.
For Macs, use py2app, PyInstaller or cx_Freeze.

------------------------------------------------------------------------