Google Python Style Guide

=====================================================

1 Background

Python is the main dynamic language used at Google. This style guide is a list of dos and don’ts for Python programs.

To help you format code correctly, we’ve created a settings file for Vim. For Emacs, the default settings should be fine.

2 Python Language Rules

pylint is a tool for finding bugs and style problems in Python source code. It finds problems that are typically caught by a compiler for less dynamic languages like C and C++. Because of the dynamic nature of Python, some warnings may be incorrect; however, spurious warnings should be fairly infrequent.

Catches easy-to-miss errors like typos, using-vars-before-assignment, etc.

pylint isn’t perfect. To take advantage of it, sometimes we’ll need to write around it, suppress its warnings or fix it.

Make sure you run pylint on your code.

Suppress warnings if they are inappropriate so that other issues are not hidden. To suppress warnings, you can set a line-level comment:
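As a sketch of what such a line-level comment can look like: the handler class below is hypothetical, while invalid-name is an existing pylint symbolic name.

```python
class ExampleHandler:
    """Hypothetical handler whose method name is dictated by a framework."""

    def do_PUT(self):  # WSGI name, so pylint: disable=invalid-name
        return "handled PUT"
```

The reason for the suppression sits on the same line as the directive, so a search for "pylint: disable" finds both.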

Each pylint warning is identified by a symbolic name (for example, empty-docstring). Google-specific warnings start with g-.

If the reason for the suppression is not clear from the symbolic name, add an explanation.

Suppressing in this way has the advantage that we can easily search for suppressions and revisit them.

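You can get a list of pylint warnings by running pylint --list-msgs.

To get more information on a particular message, run pylint --help-msg with the message name, for example pylint --help-msg=invalid-name.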

Prefer pylint: disable to the deprecated older form pylint: disable-msg.

Unused argument warnings can be suppressed by deleting the variables at the beginning of the function. Always include a comment explaining why you are deleting it. “Unused.” is sufficient. For example:
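A minimal sketch of the deletion pattern; the function and argument names are illustrative only.

```python
from typing import Optional


def viking_cafe_order(spam: str, beans: str, eggs: Optional[str] = None) -> str:
    del beans, eggs  # Unused by vikings.
    return spam + spam + spam
```

Callers can still pass beans and eggs by name, and any later use of either name inside the function fails immediately with a NameError.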

Other common forms of suppressing this warning include using ‘_’ as the identifier for the unused argument, prefixing the argument name with ‘unused_’, or assigning the arguments to ‘_’. These forms are allowed but no longer encouraged: they break callers that pass arguments by name and do not enforce that the arguments are actually unused.

Use import statements for packages and modules only, not for individual classes or functions.

Reusability mechanism for sharing code from one module to another.

The namespace management convention is simple. The source of each identifier is indicated in a consistent way; x.Obj says that object Obj is defined in module x.

Module names can still collide. Some module names are inconveniently long.

  • Use import x for importing packages and modules.

  • Use from x import y where x is the package prefix and y is the module name with no prefix.

  • Use from x import y as z if two modules named y are to be imported, if y conflicts with a top-level name defined in the current module, or if y is an inconveniently long name.

  • Use import y as z only when z is a standard abbreviation (e.g., np for numpy).

For example the module sound.effects.echo may be imported as follows:
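For sound.effects.echo itself, the preferred form would be from sound.effects import echo, after which names are written as echo.SomeName. That package is not available here, so the sketch below uses standard-library packages as stand-ins to show the same import forms in runnable code.

```python
# import x
import os
# from x import y, where x is the package prefix and y is the module.
from concurrent import futures
# from x import y as z, where y is an inconveniently long name.
from xml.etree import ElementTree as ET

# Every name stays qualified by the module it comes from.
joined = os.path.join("sound", "effects", "echo.py")
executor_class = futures.ThreadPoolExecutor
root = ET.fromstring("<effects><echo/></effects>")
```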

Do not use relative names in imports. Even if the module is in the same package, use the full package name. This helps prevent unintentionally importing a package twice.

Exemptions from this rule:

Import each module using the full pathname location of the module.

Avoids conflicts in module names or incorrect imports due to the module search path not being what the author expected. Makes it easier to find modules.

Makes it harder to deploy code because you have to replicate the package hierarchy. Not really a problem with modern deployment mechanisms.

All new code should import each module by its full package name.

Imports should be as follows:

(assume this file lives in doctor/who/ where jodie.py also exists)

The directory the main binary is located in should not be assumed to be in sys.path despite that happening in some environments. This being the case, code should assume that import jodie refers to a third party or top level package named jodie, not a local jodie.py.
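A runnable sketch of the full-package form. The doctor/who layout is rebuilt in a temporary directory purely for illustration; in a real project the package root would already be importable.

```python
import pathlib
import sys
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Build doctor/who/jodie.py under a throwaway root (illustration only).
    package_dir = pathlib.Path(root) / "doctor" / "who"
    package_dir.mkdir(parents=True)
    (pathlib.Path(root) / "doctor" / "__init__.py").write_text("")
    (package_dir / "__init__.py").write_text("")
    (package_dir / "jodie.py").write_text("NAME = 'jodie'\n")

    sys.path.insert(0, root)
    try:
        # Yes: the full package name says exactly which jodie is meant.
        from doctor.who import jodie
        module_name = jodie.__name__
    finally:
        sys.path.remove(root)

# No: a bare "import jodie" would be read as a third-party or top-level
# package named jodie, not the local jodie.py.
```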

Exceptions are allowed but must be used carefully.

Exceptions are a means of breaking out of normal control flow to handle errors or other exceptional conditions.

The control flow of normal operation code is not cluttered by error-handling code. It also allows the control flow to skip multiple frames when a certain condition occurs, e.g., returning from N nested functions in one step instead of having to plumb error codes through.

May cause the control flow to be confusing. Easy to miss error cases when making library calls.

Exceptions must follow certain conditions:

  • Make use of built-in exception classes when it makes sense. For example, raise a ValueError to indicate a programming mistake like a violated precondition (such as if you were passed a negative number but required a positive one). Do not use assert statements for validating argument values of a public API. assert is used to ensure internal correctness, not to enforce correct usage nor to indicate that some unexpected event occurred. If an exception is desired in the latter cases, use a raise statement. For example:
Yes:
  def connect_to_next_port(self, minimum: int) -> int:
    """Connects to the next available port.

    Args:
      minimum: A port value greater or equal to 1024.

    Returns:
      The new minimum port.

    Raises:
      ConnectionError: If no available port is found.
    """
    if minimum < 1024:
      # Note that this raising of ValueError is not mentioned in the doc
      # string's "Raises:" section because it is not appropriate to
      # guarantee this specific behavioral reaction to API misuse.
      raise ValueError(f'Min. port must be at least 1024, not {minimum}.')
    port = self._find_next_open_port(minimum)
    if port is None:
      raise ConnectionError(
          f'Could not connect to service on port {minimum} or higher.')
    assert port >= minimum, (
        f'Unexpected port {port} when minimum was {minimum}.')
    return port

No:
  def connect_to_next_port(self, minimum: int) -> int:
    """Connects to the next available port.

    Args:
      minimum: A port value greater or equal to 1024.

    Returns:
      The new minimum port.
    """
    assert minimum >= 1024, 'Minimum port must be at least 1024.'
    port = self._find_next_open_port(minimum)
    assert port is not None
    return port

  • Libraries or packages may define their own exceptions. When doing so they must inherit from an existing exception class. Exception names should end in Error and should not introduce repetition (foo.FooError).

  • Never use catch-all except: statements, or catch Exception or StandardError, unless you are

      • re-raising the exception, or
      • creating an isolation point in the program where exceptions are not propagated but are recorded and suppressed instead, such as protecting a thread from crashing by guarding its outermost block.

    Python is very tolerant in this regard and except: will really catch everything including misspelled names, sys.exit() calls, Ctrl+C interrupts, unittest failures and all kinds of other exceptions that you simply don’t want to catch.

  • Minimize the amount of code in a try/ except block. The larger the body of the try, the more likely that an exception will be raised by a line of code that you didn’t expect to raise an exception. In those cases, the try/ except block hides a real error.

  • Use the finally clause to execute code whether or not an exception is raised in the try block. This is often useful for cleanup, e.g., closing a file.
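The conditions above can be sketched together as follows. ConfigError, read_port, and first_line are hypothetical names chosen for illustration.

```python
class ConfigError(ValueError):
    """Raised when a configuration value cannot be used."""


def read_port(settings: dict) -> int:
    """Returns the configured port, or raises ConfigError."""
    # Keep the try body to the single lookup expected to raise KeyError.
    try:
        raw_port = settings["port"]
    except KeyError as error:
        raise ConfigError("settings has no 'port' entry") from error
    return int(raw_port)


def first_line(path: str) -> str:
    """Returns the first line of a file, always closing the handle."""
    handle = open(path, encoding="utf-8")
    try:
        return handle.readline()
    finally:
        handle.close()  # Runs whether or not readline() raised.
```

The custom exception inherits from a built-in class and ends in Error, the except clause names one specific exception, and the finally clause carries the cleanup.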

Avoid global variables.

Variables that are declared at the module level or as class attributes.

Occasionally useful.

Has the potential to change module behavior during the import, because assignments to global variables are done when the module is first imported.

Avoid global variables.

If needed, global variables should be declared at the module level and made internal to the module by prepending an _ to the name. External access to global variables must be done through public module-level functions. See Naming below.

While module-level constants are technically variables, they are permitted and encouraged. For example: MAX_HOLY_HANDGRENADE_COUNT = 3. Constants must be named using all caps with underscores. See Naming below.
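A minimal sketch of these conventions. Only MAX_HOLY_HANDGRENADE_COUNT comes from the text; the counter and both functions are hypothetical.

```python
MAX_HOLY_HANDGRENADE_COUNT = 3  # Module-level constant: all caps.

_thrown_count = 0  # Internal module state: leading underscore.


def throw_grenade() -> int:
    """Records one throw and returns the total so far."""
    global _thrown_count
    if _thrown_count >= MAX_HOLY_HANDGRENADE_COUNT:
        raise RuntimeError("No grenades left.")
    _thrown_count += 1
    return _thrown_count


def thrown_count() -> int:
    """Public accessor for the internal counter."""
    return _thrown_count
```

Other modules read the state through thrown_count() rather than touching _thrown_count directly.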

Nested local functions or classes are fine when used to close over a local variable. Inner classes are fine.

A class can be defined inside of a method, function, or class. A function can be defined inside a method or function. Nested functions have read-only access to variables defined in enclosing scopes.

Allows definition of utility classes and functions that are only used inside of a very limited scope. Very ADT-y. Commonly used for implementing decorators.

Nested functions and classes cannot be directly tested. Nesting can make the outer function longer and less readable.

They are fine with some caveats. Avoid nested functions or classes except when closing over a local value other than self or cls. Do not nest a function just to hide it from users of a module. Instead, prefix its name with an _ at the module level so that it can still be accessed by tests.
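A sketch of the acceptable case, a nested function that closes over a local value. The names are illustrative.

```python
from typing import Callable, List


def make_threshold_filter(threshold: int) -> Callable[[List[int]], List[int]]:
    """Returns a function keeping only values above threshold."""

    def keep_large(values: List[int]) -> List[int]:
        # keep_large closes over the local variable threshold.
        return [value for value in values if value > threshold]

    return keep_large
```

A helper that does not need threshold would instead live at module level under a name starting with _, where tests can reach it.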

Okay to use for simple cases.

List, Dict, and Set comprehensions as well as generator expressions provide a concise and efficient way to create container types and iterators without resorting to the use of traditional loops, map(), filter(), or lambda.

Simple comprehensions can be clearer and simpler than other dict, list, or set creation techniques. Generator expressions can be very efficient, since they avoid the creation of a list entirely.

Complicated comprehensions or generator expressions can be hard to read.

Okay to use for simple cases. Each portion must fit on one line: mapping expression, for clause, filter expression. Multiple for clauses or filter expressions are not permitted. Use loops instead when things get more complicated.
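For illustration, a few simple cases next to one that is better written as a loop (the data is made up):

```python
numbers = [1, 2, 3, 4, 5, 6]

# Simple cases: one for clause, at most one filter expression.
squares_of_evens = [n * n for n in numbers if n % 2 == 0]
square_by_number = {n: n * n for n in numbers}
total = sum(n * n for n in numbers)  # Generator expression; no list built.

# More than one for clause: write a loop instead of a comprehension.
pairs = []
for x in range(3):
    for y in range(x):
        pairs.append((x, y))
```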

Use default iterators and operators for types that support them, like lists, dictionaries, and files.

Container types, like dictionaries and lists, define default iterators and membership test operators (“in” and “not in”).

The default iterators and operators are simple and efficient. They express the operation directly, without extra method calls. A function that uses default operators is generic. It can be used with any type that supports the operation.

You can’t tell the type of objects by reading the method names (e.g. has_key() means a dictionary). This is also an advantage.

Use default iterators and operators for types that support them, like lists, dictionaries, and files. The built-in types define iterator methods, too. Prefer these methods to methods that return lists, except that you should not mutate a container while iterating over it.
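A short sketch with a made-up dictionary:

```python
inventory = {"spam": 3, "eggs": 0, "beans": 7}

# Iterate over the dict directly rather than over inventory.keys().
names = [name for name in inventory]

# Membership test with the in operator rather than a method call.
has_spam = "spam" in inventory
has_ham = "ham" in inventory

# Do not mutate a container while iterating over it: iterate over a
# snapshot of the keys when entries need to be removed.
for name in list(inventory):
    if inventory[name] == 0:
        del inventory[name]
```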

Use generators as needed.

A generator function returns an iterator that yields a value each time it executes a yield statement. After it yields a value, the runtime state of the generator function is suspended until the next value is needed.

Simpler code, because the state of local variables and control flow are preserved for each call. A generator uses less memory than a function that creates an entire list of values at once.

Local variables in the generator will not be garbage collected until the generator is either consumed to exhaustion or itself garbage collected.

Fine. Use “Yields:” rather than “Returns:” in the docstring for generator functions.

If a generator manages an expensive resource, force the cleanup explicitly. A good way to do so is to wrap the generator with a context manager (see PEP 533).
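A sketch of a generator with a “Yields:” docstring, wrapped in a context manager so that its cleanup runs promptly. contextlib.closing is one possible wrapper; countdown and the events list are hypothetical.

```python
import contextlib
from typing import Iterator, List

events: List[str] = []


def countdown(start: int) -> Iterator[int]:
    """Counts down from start to 1.

    Yields:
        The next number in the countdown.
    """
    try:
        while start > 0:
            yield start
            start -= 1
    finally:
        events.append("cleaned up")  # Runs when the generator is closed.


# closing() calls the generator's close() on exit, so cleanup does not
# wait for exhaustion or garbage collection.
with contextlib.closing(countdown(5)) as numbers:
    first_two = [next(numbers), next(numbers)]
```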

Okay for one-liners. Prefer generator expressions over map() or filter() with a lambda.

Lambdas define anonymous functions in an expression, as opposed to a statement.

Convenient.

Harder to read and debug than local functions. The lack of names means stack traces are more difficult to understand. Expressiveness is limited because the function may only contain an expression.

Okay to use them for one-liners. If the code inside the lambda function is longer than 60-80 chars, it’s probably better to define it as a regular nested function.

For common operations like multiplication, use the functions from the operator module instead of lambda functions. For example, prefer operator.mul to lambda x, y: x * y.
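For instance, with functools.reduce over a made-up list:

```python
import functools
import operator

factors = [2, 3, 7]

# Preferred: a named function from the operator module.
product = functools.reduce(operator.mul, factors, 1)

# Equivalent, but the lambda adds nothing over operator.mul.
product_with_lambda = functools.reduce(lambda x, y: x * y, factors, 1)
```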

Okay for simple cases.

Conditional expressions (sometimes called a “ternary operator”) are mechanisms that provide a shorter syntax for if statements. For example: x = 1 if cond else 2.

Shorter and more convenient than an if statement.

May be harder to read than an if statement. The condition may be difficult to locate if the expression is long.

Okay to use for simple cases. Each portion must fit on one line: true-expression, if-expression, else-expression. Use a complete if statement when things get more complicated.
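A sketch contrasting a simple case with one that deserves a full if statement. Both functions and their values are hypothetical.

```python
def describe(count: int) -> str:
    # Simple case: each portion fits on one line.
    noun = "user" if count == 1 else "users"
    return f"{count} {noun}"


def shipping_cost(weight_kg: float, is_member: bool) -> float:
    # Several conditions: a complete if statement reads better than a
    # nested conditional expression.
    if is_member:
        return 0.0
    if weight_kg < 1.0:
        return 2.5
    return 5.0
```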

Okay in most cases.

You can specify values for variables at the end of a function’s parameter list, e.g., def foo(a, b=0):. If foo is called with only one argument, b is set to 0. If it is called with two arguments, b has the value of the second argument.

Often you have a function that uses lots of default values, but on rare occasions you want to override the defaults. Default argument values provide an easy way to do this, without having to define lots of functions for the rare exceptions. As Python does not support overloaded methods/functions, default arguments are an easy way of “faking” the overloading behavior.

Default arguments are evaluated once, when the function definition is executed (for a module-level function, at module load time). This may cause problems if the argument is a mutable object such as a list or a dictionary. If the function modifies the object (e.g., by appending an item to a list), the default value is modified.

Okay to use with the following caveat:

Do not use mutable objects as default values in the function or method definition.
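A sketch of the problem and the usual workaround of defaulting to None. The function names are illustrative.

```python
from typing import List, Optional


def append_bad(item: int, items: List[int] = []) -> List[int]:  # Avoid.
    items.append(item)  # Mutates the one shared default list.
    return items


def append_good(item: int, items: Optional[List[int]] = None) -> List[int]:
    if items is None:
        items = []  # A fresh list on every call.
    items.append(item)
    return items


bad_first = list(append_bad(1))
bad_second = list(append_bad(2))  # The shared default still holds 1.
good_first = append_good(1)
good_second = append_good(2)
```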

Properties may be used to control getting or setting attributes that require trivial computations or logic. Property implementations must match the general expectations of regular attribute access: that they are cheap, straightforward, and unsurprising.

A way to wrap method calls for getting and setting an attribute as a standard attribute access.

  • Allows for an attribute access and assignment API rather than getter and setter method calls.

  • Can be used to make an attribute read-only.

  • Allows calculations to be lazy.

  • Provides a way to maintain the public interface of a class when the internals evolve independently of class users.

  • Can hide side-effects much like operator overloading.

  • Can be confusing for subclasses.

Properties are allowed, but, like operator overloading, should only be used when necessary and match the expectations of typical attribute access; follow the getters and setters rules otherwise.

For example, using a property simply to get and set an internal attribute isn’t allowed: there is no computation occurring, so the property is unnecessary (make the attribute public instead). In comparison, using a property to control attribute access or to calculate a trivially derived value is allowed: the logic is simple and unsurprising.

Properties should be created with the @property decorator. Manually implementing a property descriptor is considered a power feature.

Inheritance with properties can be non-obvious. Do not use properties to implement computations a subclass may ever want to override and extend.
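A minimal sketch of an allowed use: a plain public attribute plus a read-only, trivially derived property. The Square class is hypothetical.

```python
class Square:
    """A square with a settable side and a derived, read-only area."""

    def __init__(self, side: float):
        self.side = side  # Plain public attribute; no property needed.

    @property
    def area(self) -> float:
        """Trivially derived value: cheap and unsurprising."""
        return self.side ** 2
```

Because no setter is defined, assigning to area raises AttributeError, which is how a property makes an attribute read-only.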

Use the “implicit” false if at all possible.

Python evaluates certain values as False when in a boolean context. A quick “rule of thumb” is that all “empty” values are considered false, so 0, None, [], {}, '' all evaluate as false in a boolean context.

Conditions using Python booleans are easier to read and less error-prone. In most cases, they’re also faster.

May look strange to C/C++ developers.

Use the “implicit” false if possible, e.g., if foo: rather than if foo != []:. There are a few caveats that you should keep in mind though:

  • Always use if foo is None: (or is not None) to check for a None value. E.g., when testing whether a variable or argument that defaults to None was set to some other value. The other value might be a value that’s false in a boolean context!

  • Never compare a boolean variable to False using ==. Use if not x: instead. If you need to distinguish False from None then chain the expressions, such as if not x and x is not None:.

  • For sequences (strings, lists, tuples), use the fact that empty sequences are false, so if seq: and if not seq: are preferable to if len(seq): and if not len(seq): respectively.

  • When handling integers, implicit false may involve more risk than benefit (e.g., accidentally handling None as 0). You may compare a value which is known to be an integer (and is not the result of len()) against the integer 0.

Yes: if not users:
         print('no users')

     if i % 10 == 0:
         self.handle_multiple_of_ten()

     def f(x=None):
         if x is None:
             x = []

No:  if len(users) == 0:
         print('no users')

     if not i % 10:
         self.handle_multiple_of_ten()

     def f(x=None):
         x = x or []

  • Note that '0' (i.e., 0 as string) evaluates to true.

  • Note that NumPy arrays may raise an exception in an implicit boolean context. Prefer the .size attribute when testing emptiness of a np.array (e.g. if not users.size).

Okay to use.

A nested Python function can refer to variables defined in enclosing functions, but cannot assign to them. Variable bindings are resolved using lexical scoping, that is, based on the static program text. Any assignment to a name in a block will cause Python to treat all references to that name as a local variable, even if the use precedes the assignment. If a global declaration occurs, the name is treated as a global variable.

An example of the use of this feature is:
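A minimal sketch of such a closure; the function names are illustrative.

```python
from typing import Callable


def get_adder(summand1: float) -> Callable[[float], float]:
    """Returns a function that adds numbers to a given number."""

    def adder(summand2: float) -> float:
        # summand1 is resolved lexically in the enclosing function.
        return summand1 + summand2

    return adder
```

Calling get_adder(2)(3) evaluates summand1 + summand2 with summand1 bound to 2.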

Often results in clearer, more elegant code. Especially comforting to experienced Lisp and Scheme (and Haskell and ML and …) programmers.

So foo([1, 2, 3]) will print 1 2 3 3, not 1 2 3 4.
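The kind of function this describes can be sketched as follows, adapted from the example in PEP 227. The exact body is an assumption, since only the name foo and its printed output are given.

```python
i = 4


def foo(x):
    def bar():
        # This i is foo's local i, not the module-level i above,
        # because foo assigns to i in its for loop.
        print(i, end=" ")

    for i in x:
        print(i, end=" ")
    bar()
```

The for loop makes i local to foo, so bar sees the last loop value rather than the module-level 4.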

Okay to use.

Use decorators judiciously when there is a clear advantage. Avoid staticmethod and limit use of classmethod.

Decorators for Functions and Methods (a.k.a “the @ notation”). One common decorator is @property, used for converting ordinary methods into dynamically computed attributes. However, the decorator syntax allows for user-defined decorators as well. Specifically, for some function my_decorator, this:

is equivalent to:
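As a sketch of the two forms: the body of my_decorator and the classes C and D are illustrative assumptions, and only the name my_decorator comes from the text.

```python
import functools


def my_decorator(function):
    """Decorator that records every call in my_decorator.calls."""

    @functools.wraps(function)
    def wrapper(*args, **kwargs):
        my_decorator.calls.append(function.__name__)
        return function(*args, **kwargs)

    return wrapper


my_decorator.calls = []


class C:
    @my_decorator
    def method(self):
        return "decorated"


class D:
    def method(self):
        return "decorated"

    method = my_decorator(method)
```

C uses the @ notation and D applies the decorator by hand; both classes end up with the same wrapped method.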

Elegantly specifies some transformation on a method; the transformation might eliminate some repetitive code, enforce invariants, etc.

Decorators can perform arbitrary operations on a function’s arguments or return values, resulting in surprising implicit behavior. Additionally, decorators execute at object definition time. For module-level objects (classes, module functions, …) this happens at import time. Failures in decorator code are pretty much impossible to recover from.

Use decorators judiciously when there is a clear advantage. Decorators should follow the same import and naming guidelines as functions. Decorator pydoc should clearly state that the function is a decorator. Write unit tests for decorators.

Avoid external dependencies in the decorator itself (e.g. don’t rely on files, sockets, database connections, etc.), since they might not be available when the decorator runs (at import time, perhaps from pydoc or other tools). A decorator that is called with valid parameters should (as much as possible) be guaranteed to succeed in all cases.

Decorators are a special case of “top level code” – see main for more discussion.

Never use staticmethod unless forced to in order to integrate with an API defined in an existing library. Write a module level function instead.

Use classmethod only when writing a named constructor or a class-specific routine that modifies necessary global state such as a process-wide cache.

Do not rely on the atomicity of built-in types.

While Python’s built-in data types such as dictionaries appear to have atomic operations, there are corner cases where they aren’t atomic (e.g. if __hash__ or __eq__ are implemented as Python methods) and their atomicity should not be relied upon. Neither should you rely on atomic variable assignment (since this in turn depends on dictionaries).

Use the queue module’s Queue data type as the preferred way to communicate data between threads. Otherwise, use the threading module and its locking primitives. Prefer condition variables and threading.Condition instead of using lower-level locks.
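A minimal sketch of passing work to a thread through a queue, using the Python 3 module name queue. The worker function, the sentinel, and the results list are hypothetical.

```python
import queue
import threading

work_queue = queue.Queue()
results = []
results_lock = threading.Lock()
_STOP = object()  # Sentinel telling the worker to exit.


def worker() -> None:
    while True:
        item = work_queue.get()
        if item is _STOP:
            break
        # Guard the shared list explicitly instead of relying on
        # list.append being atomic.
        with results_lock:
            results.append(item * item)


thread = threading.Thread(target=worker)
thread.start()
for number in range(5):
    work_queue.put(number)
work_queue.put(_STOP)
thread.join()
```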

Avoid these features.

Python is an extremely flexible language and gives you many fancy features such as custom metaclasses, access to bytecode, on-the-fly compilation, dynamic inheritance, object reparenting, import hacks, reflection (e.g. some uses of getattr()), modification of system internals, __del__ methods implementing customized cleanup, etc.

These are powerful language features. They can make your code more compact.

It’s very tempting to use these “cool” features when they’re not absolutely necessary. It’s harder to read, understand, and debug code that’s using unusual features underneath. It doesn’t seem that way at first (to the original author), but when revisiting the code, it tends to be more difficult than code that is longer but is straightforward.

Avoid these features in your code.

Standard library modules and classes that internally use these features are okay to use (for example, abc.ABCMeta, dataclasses, and enum).

New language version semantic changes may be gated behind a special future import to enable them on a per-file basis within earlier runtimes.

Being able to turn on some of the more modern features via from __future__ import statements allows early use of features from expected future Python versions.

This has proven to make runtime version upgrades smoother as changes can be made on a per-file basis while declaring compatibility and preventing regressions within those files. Modern code is more maintainable as it is less likely to accumulate technical debt that will be problematic during future runtime upgrades.

Such code may not work on very old interpreter versions prior to the introduction of the needed future statement. The need for this is more common in projects supporting an extremely wide variety of environments.

Use of from __future__ import statements is encouraged. It allows a given source file to start using more modern Python syntax features today. Once you no longer need to run on a version where the features are hidden behind a __future__ import, feel free to remove those lines.

In code that may execute on versions as old as 3.5 rather than >= 3.7, import:
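The import itself is not shown here. On the assumption that the feature meant is generator_stop (it matches the 3.5 and 3.7 version bounds in the sentence above), the standard __future__ module can be used to check when a feature became optional and when it became mandatory:

```python
import __future__
import sys

# Assumption: the import meant in the text is
#     from __future__ import generator_stop
# In a real source file, that line has to come before any other code.
feature = __future__.generator_stop
optional_since = feature.getOptionalRelease()[:2]
mandatory_since = feature.getMandatoryRelease()[:2]

# True only on interpreters where the behavior is still opt-in.
needs_import = sys.version_info[:2] < mandatory_since
```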

Please don’t remove these imports until you are confident the code is only ever used in a sufficiently modern environment. Even if you do not currently use the feature a specific future import enables in your code today, keeping it in place in the file prevents later modifications of the code from inadvertently depending on the older behavior.

Use other from __future__ import statements as you see fit.

You can annotate Python code with type hints according to PEP-484, and type-check the code at build time with a type checking tool like pytype.

Type annotations can be in the source or in a stub pyi file. Whenever possible, annotations should be in the source. Use pyi files for third-party or extension modules.

Type annotations (or “type hints”) are for function or method arguments and return values:
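For instance, a small hypothetical function with an annotated argument and return value:

```python
from typing import List


def func(a: int) -> List[int]:
    """Returns the integers from 0 up to, but not including, a."""
    return list(range(a))
```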

Type annotations improve the readability and maintainability of your code. The type checker will convert many runtime errors to build-time errors, and reduce your ability to use Power Features.

You will have to keep the type declarations up to date. You might see type errors on code that you think is valid. Use of a type checker may reduce your ability to use Power Features.

You are strongly encouraged to enable Python type analysis when updating code. When adding or modifying public APIs, include type annotations and enable checking via pytype in the build system. As static analysis is relatively new to Python, we acknowledge that undesired side-effects (such as wrongly inferred types) may prevent adoption by some projects. In those situations, authors are encouraged to add a comment with a TODO or link to a bug describing the issue(s) currently preventing type annotation adoption in the BUILD file or in the code itself as appropriate.

3 Python Style Rules

Do not terminate your lines with semicolons, and do not use semicolons to put two statements on the same line.

Maximum line length is 80 characters.

Explicit exceptions to the 80 character limit:

  • Long import statements.

  • URLs, pathnames, or long flags in comments.

  • Long string module level constants not containing whitespace that would be inconvenient to split across lines such as URLs or pathnames.

  • Pylint disable comments. (e.g.: # pylint: disable=invalid-name)

Do not use backslash line continuation except for with statements requiring three or more context managers.

When a literal string won’t fit on a single line, use parentheses for implicit line joining.

Within comments, put long URLs on their own line if necessary.

It is permissible to use backslash continuation when defining a with statement whose expressions span three or more lines. For two lines of expressions, use a nested with statement:
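A sketch of these continuation rules. io.StringIO objects stand in for real files so that the example is self-contained.

```python
import io

# Implicit line joining inside parentheses for a long string literal.
message = ("This will build a very long long "
           "long long long long long long string")

# Two context managers: nest the with statements.
with io.StringIO("first") as first_file:
    with io.StringIO("second") as second_file:
        pair = first_file.read() + second_file.read()

# Three or more: backslash continuation is permitted.
with io.StringIO("a") as file_a, \
     io.StringIO("b") as file_b, \
     io.StringIO("c") as file_c:
    triple = file_a.read() + file_b.read() + file_c.read()
```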

Make note of the indentation of the elements in the line continuation examples above; see the indentation section for explanation.

In all other cases where a line exceeds 80 characters, and the yapf auto-formatter does not help bring the line below the limit, the line is allowed to exceed this maximum. Authors are encouraged to manually break the line up per the notes above when it is sensible.

Use parentheses sparingly.

It is fine, though not required, to use parentheses around tuples. Do not use them in return statements or conditional statements unless using parentheses for implied line continuation or to indicate a tuple.

Indent your code blocks with 4 spaces.

Never use tabs or mix tabs and spaces. In cases of implied line continuation, you should align wrapped elements either vertically, as per the examples in the line length section; or using a hanging indent of 4 spaces, in which case there should be nothing after the open parenthesis or bracket on the first line.

Trailing commas in sequences of items are recommended only when the closing container token ], ), or } does not appear on the same line as the final element. The presence of a trailing comma is also used as a hint to our Python code auto-formatter YAPF to direct it to auto-format the container of items to one item per line when the , after the final element is present.
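For illustration, with made-up list names:

```python
# Closing bracket on the same line as the last element: no trailing comma.
golomb3 = [0, 1, 3]

# Closing bracket on its own line: add the trailing comma.
golomb4 = [
    0,
    1,
    4,
    6,
]
```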

Two blank lines between top-level definitions, be they function or class definitions. One blank line between method definitions and between the class line and the first method. No blank line following a def line. Use single blank lines as you judge appropriate within functions or methods.

Blank lines need not be anchored to the definition. For example, related comments immediately preceding function, class, and method definitions can make sense. Consider if your comment might be more useful as part of the docstring.

Follow standard typographic rules for the use of spaces around punctuation.

No whitespace inside parentheses, brackets or braces.

No whitespace before a comma, semicolon, or colon. Do use whitespace after a comma, semicolon, or colon, except at the end of the line.

No whitespace before the open paren/bracket that starts an argument list, indexing or slicing.

No trailing whitespace.

Surround binary operators with a single space on either side for assignment (=), comparisons (==, <, >, !=, <=, >=, in, not in, is, is not), and Booleans (and, or, not). Use your better judgment for the insertion of spaces around arithmetic operators (+, -, *, /, //, %, **, @).

Never use spaces around = when passing keyword arguments or defining a default parameter value, with one exception: when a type annotation is present, do use spaces around the = for the default parameter value.
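The following sketch shows both cases side by side; the function names are hypothetical:

```python
def complex_number(real, imag=0.0):
    """Builds a complex number; no spaces around = for a plain default."""
    return complex(real=real, imag=imag)


def scaled(value: float, factor: float = 2.0) -> float:
    """Scales value; spaces around = because the parameter is annotated."""
    return value * factor


# Keyword argument at the call site: no spaces around =.
z = complex_number(1.0, imag=2.0)
```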

Don’t use spaces to vertically align tokens on consecutive lines, since it becomes a maintenance burden (applies to :, #, =, etc.):
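A sketch of the preferred form, with the discouraged form kept in comments so that the example itself stays conformant (all names are placeholders):

```python
# Preferred: a single space around = and two spaces before each comment.
foo = 1000  # comment
long_name = 2  # comment that should not be aligned

dictionary = {
    'foo': 1,
    'long_name': 2,
}

# Discouraged: padding tokens so that they line up in columns.
#   foo       = 1000  # comment
#   long_name = 2     # comment that should not be aligned
```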

Most .py files do not need to start with a #! line. Start the main file of a program with #!/usr/bin/env python3 (to support virtualenvs) or #!/usr/bin/python3 per PEP-394.

This line is used by the kernel to find the Python interpreter, but is ignored by Python when importing modules. It is only necessary on a file intended to be executed directly.

Be sure to use the right style for module, function, method docstrings and inline comments.

Python uses docstrings to document code. A docstring is a string that is the first statement in a package, module, class or function. These strings can be extracted automatically through the __doc__ member of the object and are used by pydoc. (Try running pydoc on your module to see how it looks.) Always use the three double-quote """ format for docstrings (per PEP 257). A docstring should be organized as a summary line (one physical line not exceeding 80 characters) terminated by a period, question mark, or exclamation point. When writing more (encouraged), this must be followed by a blank line, followed by the rest of the docstring starting at the same cursor position as the first quote of the first line. There are more formatting guidelines for docstrings below.

Every file should contain license boilerplate. Choose the appropriate boilerplate for the license used by the project (for example, Apache 2.0, BSD, LGPL, GPL)

Files should start with a docstring describing the contents and usage of the module.

In this section, “function” means a method, function, or generator.

A function must have a docstring, unless it meets all of the following criteria:

  • not externally visible
  • very short
  • obvious

A docstring should give enough information to write a call to the function without reading the function’s code. The docstring should describe the function’s calling syntax and its semantics, but generally not its implementation details, unless those details are relevant to how the function is to be used. For example, a function that mutates one of its arguments as a side effect should note that in its docstring. Otherwise, subtle but important details of a function’s implementation that are not relevant to the caller are better expressed as comments alongside the code than within the function’s docstring.

The docstring should be descriptive-style ("""Fetches rows from a Bigtable.""") rather than imperative-style ("""Fetch rows from a Bigtable."""). The docstring for a @property data descriptor should use the same style as the docstring for an attribute or a function argument ("""The Bigtable path.""", rather than """Returns the Bigtable path.""").

A method that overrides a method from a base class may have a simple docstring sending the reader to its overridden method’s docstring, such as """See base class.""". The rationale is that there is no need to repeat in many places documentation that is already present in the base method’s docstring. However, if the overriding method’s behavior is substantially different from the overridden method, or details need to be provided (e.g., documenting additional side effects), a docstring with at least those differences is required on the overriding method.

Certain aspects of a function should be documented in special sections, listed below. Each section begins with a heading line, which ends with a colon. All sections other than the heading should maintain a hanging indent of two or four spaces (be consistent within a file). These sections can be omitted in cases where the function’s name and signature are informative enough that it can be aptly described using a one-line docstring.

Args: List each parameter by name. A description should follow the name, and be separated by a colon followed by either a space or newline. If the description is too long to fit on a single 80-character line, use a hanging indent of 2 or 4 spaces more than the parameter name (be consistent with the rest of the docstrings in the file). The description should include required type(s) if the code does not contain a corresponding type annotation. If a function accepts *foo (variable length argument lists) and/or **bar (arbitrary keyword arguments), they should be listed as *foo and **bar.

Returns: (or Yields: for generators) Describe the type and semantics of the return value. If the function only returns None, this section is not required. It may also be omitted if the docstring starts with Returns or Yields (e.g. """Returns row from Bigtable as a tuple of strings.""") and the opening sentence is sufficient to describe the return value. Do not imitate ‘NumPy style’, which frequently documents a tuple return value as if it were multiple return values with individual names (never mentioning the tuple). Instead, describe such a return value as: “Returns: A tuple (mat_a, mat_b), where mat_a is …, and …”. The auxiliary names in the docstring need not necessarily correspond to any internal names used in the function body (as those are not part of the API).

Raises: List all exceptions that are relevant to the interface followed by a description. Use a similar exception name + colon + space or newline and hanging indent style as described in Args:. You should not document exceptions that get raised if the API specified in the docstring is violated (because this would paradoxically make behavior under violation of the API part of the API).

Similarly, this variation on Args: with a line break is also allowed:
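A hedged sketch of that variation follows. The function fetch_rows and its in-memory table are hypothetical stand-ins chosen so that the example runs on its own; only the docstring layout is the point.

```python
def fetch_rows(table, keys, require_all_keys=False):
    """Fetches rows from an in-memory table.

    Args:
      table:
        A dict mapping each key to its row.
      keys:
        A sequence of strings naming the rows to fetch.
      require_all_keys:
        If True, every requested key must be present in the table.

    Returns:
      A dict mapping each found key to its row. Keys missing from the
      table are omitted unless require_all_keys is True.

    Raises:
      KeyError: A key was missing and require_all_keys was True.
    """
    if require_all_keys:
        return {key: table[key] for key in keys}
    return {key: table[key] for key in keys if key in table}
```

Each section heading ends with a colon, and the section bodies keep a consistent hanging indent of two spaces.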

Classes should have a docstring below the class definition describing the class. If your class has public attributes, they should be documented here in an Attributes section and follow the same formatting as a function’s Args section.

All class docstrings should start with a one-line summary that describes what the class instance represents. This implies that subclasses of Exception should also describe what the exception represents, and not the context in which it might occur. The class docstring should not repeat unnecessary information, such as that the class is a class.
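The class docstring rules above might look like this in practice; the class names and attributes are invented for illustration:

```python
class SampleClass:
    """A container tracking spam preference and egg count.

    Attributes:
      likes_spam: A boolean indicating if we like SPAM or not.
      eggs: An integer count of the eggs we have laid.
    """

    def __init__(self, likes_spam: bool = False):
        """Initializes the instance based on spam preference.

        Args:
          likes_spam: Defines if instance exhibits this preference.
        """
        self.likes_spam = likes_spam
        self.eggs = 0


class OutOfCheeseError(Exception):
    """No more cheese is available."""
```

Note that the exception docstring says what the exception represents, not when it might be raised.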

The final place to have comments is in tricky parts of the code. If you’re going to have to explain it at the next code review, you should comment it now. Complicated operations get a few lines of comments before the operations commence. Non-obvious ones get comments at the end of the line.

To improve legibility, these comments should start at least 2 spaces away from the code with the comment character #, followed by at least one space before the text of the comment itself.

On the other hand, never describe the code. Assume the person reading the code knows Python (though not what you’re trying to do) better than you do.

Pay attention to punctuation, spelling, and grammar; it is easier to read well-written comments than badly written ones.

Comments should be as readable as narrative text, with proper capitalization and punctuation. In many cases, complete sentences are more readable than sentence fragments. Shorter comments, such as comments at the end of a line of code, can sometimes be less formal, but you should be consistent with your style.

Although it can be frustrating to have a code reviewer point out that you are using a comma when you should be using a semicolon, it is very important that source code maintain a high level of clarity and readability. Proper punctuation, spelling, and grammar help with that goal.

Use an f-string, the % operator, or the format method for formatting strings, even when the parameters are all strings. Use your best judgment to decide between + and string formatting.
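For instance, each of the three formatting styles, plus a case where plain + is reasonable (the variable names are arbitrary):

```python
name = 'Ada'
n = 3
imperative, expletive = 'Look', 'wow'

x = f'name: {name}; score: {n}'
y = '%s, %s!' % (imperative, expletive)
z = '{}, {}!'.format(imperative, expletive)

# A plain + is fine for a simple join of two pieces.
full_name = name + ' Lovelace'
```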

Avoid using the + and += operators to accumulate a string within a loop. In some conditions, accumulating a string with addition can lead to quadratic rather than linear running time. Although common accumulations of this sort may be optimized on CPython, that is an implementation detail. The conditions under which an optimization applies are not easy to predict and may change. Instead, add each substring to a list and ''.join the list after the loop terminates, or write each substring to an io.StringIO buffer. These techniques consistently have amortized-linear run time complexity.
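Both recommended techniques can be sketched as follows; the employee data is made up for the example:

```python
import io

employees = [('Ada', 'Lovelace'), ('Alan', 'Turing')]

# Option 1: collect the pieces in a list and join once after the loop.
items = ['<table>']
for first_name, last_name in employees:
    items.append('<tr><td>%s, %s</td></tr>' % (last_name, first_name))
items.append('</table>')
employee_table = ''.join(items)

# Option 2: write the pieces to an in-memory text buffer.
buffer = io.StringIO()
buffer.write('<table>')
for first_name, last_name in employees:
    buffer.write(f'<tr><td>{last_name}, {first_name}</td></tr>')
buffer.write('</table>')
employee_table_from_buffer = buffer.getvalue()
```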

Be consistent with your choice of string quote character within a file. Pick ' or " and stick with it. It is okay to use the other quote character on a string to avoid the need to backslash-escape quote characters within the string.

Prefer """ for multi-line strings rather than '''. Projects may choose to use ''' for all non-docstring multi-line strings if and only if they also use ' for regular strings. Docstrings must use """ regardless.

Multi-line strings do not flow with the indentation of the rest of the program. If you need to avoid embedding extra space in the string, use either concatenated single-line strings or a multi-line string with textwrap.dedent() to remove the initial space on each line:
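A minimal sketch of both options (the function names are placeholders):

```python
import textwrap


def build_message():
    """Returns a two-line message without leading indentation."""
    # The backslash after the opening quotes suppresses the first newline;
    # dedent() then strips the common leading whitespace from every line.
    return textwrap.dedent("""\
        This is fine, because textwrap.dedent()
        removes the common leading spaces in each line.""")


def build_message_concatenated():
    """Returns a similar message built from adjacent string literals."""
    return ('This is also fine: adjacent literals are concatenated\n'
            'without picking up the surrounding indentation.')
```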

For logging functions that expect a pattern-string (with %-placeholders) as their first argument: Always call them with a string literal (not an f-string!) as their first argument with pattern-parameters as subsequent arguments. Some logging implementations collect the unexpanded pattern-string as a queryable field. It also prevents spending time rendering a message that no logger is configured to output.
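With the standard library logging module, this could look like the sketch below; report_version is a hypothetical helper:

```python
import logging

logger = logging.getLogger(__name__)


def report_version(version: str) -> None:
    """Logs the given version string at INFO level."""
    # Pattern-string literal first, parameters afterwards. The logging
    # module interpolates only if the record is actually emitted.
    logger.info('Current version is: %s', version)
```

The unexpanded pattern remains available on the log record, which is what makes it usable as a queryable field.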

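Error messages (such as: message strings on exceptions like ValueError, or messages shown to the user) should follow three guidelines:

  • The message needs to precisely match the actual error condition.

  • Interpolated pieces need to always be clearly identifiable as such.

  • They should allow simple automated processing (e.g. grepping).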
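One way to write such a message, assuming the aim is to state the actual failure condition and to keep interpolated values easy to spot (check_probability is a hypothetical helper):

```python
def check_probability(p: float) -> float:
    """Returns p unchanged; rejects values outside [0, 1]."""
    if not 0 <= p <= 1:
        # The message names the exact condition that failed, and !r makes
        # the interpolated value clearly identifiable.
        raise ValueError(f'Not a probability: {p!r}')
    return p
```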

Explicitly close files and sockets when done with them. This rule naturally extends to closeable resources that internally use sockets, such as database connections, and also other resources that need to be closed down in a similar fashion. To name only a few examples, this also includes mmap mappings, h5py File objects, and matplotlib.pyplot figure windows.

Leaving files, sockets or other such stateful objects open unnecessarily has many downsides:

  • They may consume limited system resources, such as file descriptors. Code that deals with many such objects may exhaust those resources unnecessarily if they’re not returned to the system promptly after use.

  • Holding files open may prevent other actions such as moving or deleting them, or unmounting a filesystem.

  • Files and sockets that are shared throughout a program may inadvertently be read from or written to after logically being closed. If they are actually closed, attempts to read or write from them will raise exceptions, making the problem known sooner.

Furthermore, while files and sockets (and some similarly behaving resources) are automatically closed when the object is destructed, coupling the lifetime of the object to the state of the resource is poor practice:

  • There are no guarantees as to when the runtime will actually invoke the __del__ method. Different Python implementations use different memory management techniques, such as delayed garbage collection, which may increase the object’s lifetime arbitrarily and indefinitely.

  • Unexpected references to the file, e.g. in globals or exception tracebacks, may keep it around longer than intended.

Relying on finalizers to do automatic cleanup that has observable side effects has been rediscovered over and over again to lead to major problems, across many decades and multiple languages (Java finalizers are one well-known example).

The preferred way to manage files and similar resources is using the with statement:
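A self-contained sketch, using a throwaway temporary file so that it can run anywhere:

```python
import os
import tempfile

# A scratch file, created only so the example has something to open.
path = os.path.join(tempfile.mkdtemp(), 'hello.txt')

with open(path, 'w') as hello_file:
    hello_file.write('hello\nworld\n')

with open(path) as hello_file:
    lines = [line.rstrip('\n') for line in hello_file]

# Each file was closed as soon as its with block exited.
```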

For file-like objects that do not support the with statement, use contextlib.closing():
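The sketch below uses a hypothetical LegacyConnection class in place of a real resource; the only assumption is that it exposes close() but not the context-manager protocol:

```python
import contextlib


class LegacyConnection:
    """Hypothetical resource with close() but no with-statement support."""

    def __init__(self):
        self.closed = False

    def read(self) -> str:
        return 'payload'

    def close(self) -> None:
        self.closed = True


# closing() calls connection.close() when the block exits, even on error.
with contextlib.closing(LegacyConnection()) as connection:
    data = connection.read()
```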

In rare cases where context-based resource management is infeasible, code documentation must explain clearly how resource lifetime is managed.

Use TODO comments for code that is temporary, a short-term solution, or good-enough but not perfect.

A TODO comment begins with the string TODO in all caps and a parenthesized name, e-mail address, or other identifier of the person or issue with the best context about the problem. This is followed by an explanation of what there is to do.

The purpose is to have a consistent TODO format that can be searched to find out how to get more details. A TODO is not a commitment that the person referenced will fix the problem. Thus when you create a TODO, it is almost always your name that is given.

If your TODO is of the form “At a future date do something” make sure that you either include a very specific date (“Fix by November 2009”) or a very specific event (“Remove this code when all clients can handle XML responses.”).

E.g.:
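A sketch of such a comment in context; the identifier some_username and the function itself are placeholders, not taken from the guide:

```python
def parse_response(payload: str) -> dict:
    """Parses a 'key=value;key=value' payload into a dict."""
    # TODO(some_username): Remove this legacy format when all clients can
    # handle XML responses.
    return dict(item.split('=', 1) for item in payload.split(';') if item)
```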

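Imports are always put at the top of the file, just after any module comments and docstrings and before module globals and constants. Imports should be grouped from most generic to least generic:

  • Python future import statements (from __future__ import ...).

  • Python standard library imports.

  • Third-party module or package imports.

  • Code repository sub-package imports.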

Within each grouping, imports should be sorted lexicographically, ignoring case, according to each module’s full package path (the path in from path import ...). Code may optionally place a blank line between import sections.
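A sketch of an import block laid out this way. Only the standard library group is real code here, so that the example runs on its own; the third-party and project imports are hypothetical and shown as comments:

```python
# Standard library imports, sorted by full module path, ignoring case.
import collections
import os.path
import queue
import sys

# Third-party imports would follow, for example (hypothetical name):
#   import some_third_party_package

# Imports from the project's own packages come last (hypothetical name):
#   from myproject.backend import some_module
```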

Generally only one statement per line.

However, you may put the result of a test on the same line as the test only if the entire statement fits on one line. In particular, you can never do so with try/except since the try and except can’t both fit on the same line, and you can only do so with an if if there is no else.

Getter and setter functions (also called accessors and mutators) should be used when they provide a meaningful role or behavior for getting or setting a variable’s value.

In particular, they should be used when getting or setting the variable is complex or the cost is significant, either currently or in a reasonable future.

If, for example, a pair of getters/setters simply read and write an internal attribute, the internal attribute should be made public instead. By comparison, if setting a variable means some state is invalidated or rebuilt, it should be a setter function. The function invocation hints that a potentially non-trivial operation is occurring. Alternatively, properties may be an option when simple logic is needed, or refactoring to no longer need getters and setters.

Getters and setters should follow the Naming guidelines, such as get_foo() and set_foo().

If the past behavior allowed access through a property, do not bind the new getter/setter functions to the property. Any code still attempting to access the variable by the old method should break visibly so they are made aware of the change in complexity.
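As an illustration of when a setter function is justified, consider this hypothetical Cache class, where changing the capacity invalidates the stored entries:

```python
class Cache:
    """A hypothetical cache whose capacity change drops its contents."""

    def __init__(self, capacity: int):
        # Simple state with no logic attached: a plain public attribute.
        self.name = 'default'
        self._capacity = capacity
        self._entries = {}

    def get_capacity(self) -> int:
        """Returns the current capacity."""
        return self._capacity

    def set_capacity(self, capacity: int) -> None:
        """Sets the capacity and drops all cached entries."""
        self._capacity = capacity
        # The non-trivial side effect is what justifies a setter function.
        self._entries.clear()
```

The call cache.set_capacity(5) signals to the reader that more than a plain assignment is happening, whereas cache.name is read and written directly.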

module_name, package_name, ClassName, method_name, ExceptionName, function_name, GLOBAL_CONSTANT_NAME, global_var_name, instance_var_name, function_parameter_name, local_var_name, query_proper_noun_for_thing, send_acronym_via_https.

Function names, variable names, and filenames should be descriptive; eschew abbreviation. In particular, do not use abbreviations that are ambiguous or unfamiliar to readers outside your project, and do not abbreviate by deleting letters within a word.

Always use a .py filename extension. Never use dashes.

  • single character names, except for specifically allowed cases:
      • counters or iterators (e.g. i, j, k, v, et al.)
      • e as an exception identifier in try/except statements
      • f as a file handle in with statements
      • private TypeVars with no constraints (e.g. _T, _U, _V)

    Please be mindful not to abuse single-character naming. Generally speaking, descriptiveness should be proportional to the name’s scope of visibility. For example, i might be a fine name for a 5-line code block, but within multiple nested scopes it is likely too vague.

  • dashes (-) in any package/module name

  • __double_leading_and_trailing_underscore__ names (reserved by Python)
  • offensive terms
  • names that needlessly include the type of the variable (for example: id_to_name_dict)

  • “Internal” means internal to a module, or protected or private within a class.

  • Prepending a single underscore (_) has some support for protecting module variables and functions (linters will flag protected member access).

  • Prepending a double underscore (__ aka “dunder”) to an instance variable or method effectively makes the variable or method private to its class (using name mangling); we discourage its use as it impacts readability and testability, and isn’t really private. Prefer a single underscore.

  • Place related classes and top-level functions together in a module. Unlike Java, there is no need to limit yourself to one class per module.

  • Use CapWords for class names, but lower_with_under.py for module names. Although there are some old modules named CapWords.py, this is now discouraged because it’s confusing when the module happens to be named after a class. (“wait – did I write import StringIO or from StringIO import StringIO?”)

  • Underscores may appear in unittest method names starting with test to separate logical components of the name, even if those components use CapWords. One possible pattern is test<MethodUnderTest>_<state>; for example testPop_EmptyStack is okay. There is no One Correct Way to name test methods.

Python filenames must have a .py extension and must not contain dashes (-). This allows them to be imported and unittested. If you want an executable to be accessible without the extension, use a symbolic link or a simple bash wrapper containing exec "$0.py" "$@".

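Type                        Public              Internal
Packages                    lower_with_under
Modules                     lower_with_under    _lower_with_under
Classes                     CapWords            _CapWords
Exceptions                  CapWords
Functions                   lower_with_under()  _lower_with_under()
Global/Class Constants      CAPS_WITH_UNDER     _CAPS_WITH_UNDER
Global/Class Variables      lower_with_under    _lower_with_under
Instance Variables          lower_with_under    _lower_with_under (protected)
Method Names                lower_with_under()  _lower_with_under() (protected)
Function/Method Parameters  lower_with_under
Local Variables             lower_with_under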

For mathematically heavy code, short variable names that would otherwise violate the style guide are preferred when they match established notation in a reference paper or algorithm. When doing so, reference the source of all naming conventions in a comment or docstring or, if the source is not accessible, clearly document the naming conventions. Prefer PEP8-compliant descriptive_names for public APIs, which are much more likely to be encountered out of context.

In Python, pydoc as well as unit tests require modules to be importable. If a file is meant to be used as an executable, its main functionality should be in a main() function, and your code should always check if __name__ == '__main__' before executing your main program, so that it is not executed when the module is imported.

Otherwise, use:
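A minimal sketch of that plain form, with a placeholder main():

```python
def main():
    """Runs the program's main functionality and returns an exit code."""
    print('Hello from main()')
    return 0


# Runs only when the file is executed directly, not when it is imported.
if __name__ == '__main__':
    main()
```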

All code at the top level will be executed when the module is imported. Be careful not to call functions, create objects, or perform other operations that should not be executed when the file is being pydoced.

Prefer small and focused functions.

We recognize that long functions are sometimes appropriate, so no hard limit is placed on function length. If a function exceeds about 40 lines, think about whether it can be broken up without harming the structure of the program.

Even if your long function works perfectly now, someone modifying it in a few months may add new behavior. This could result in bugs that are hard to find. Keeping your functions short and simple makes it easier for other people to read and modify your code.

You could find long and complicated functions when working with some code. Do not be intimidated by modifying existing code: if working with such a function proves to be difficult, you find that errors are hard to debug, or you want to use a piece of it in several different contexts, consider breaking up the function into smaller and more manageable pieces.

  • Familiarize yourself with PEP-484.

  • In methods, only annotate self, or cls if it is necessary for proper type information. e.g.,

    @classmethod
    def create(cls: Type[T]) -> T:
        return cls()

  • Similarly, don’t feel compelled to annotate the return value of __init__ (where None is the only valid option).

  • If any other variable or a returned type should not be expressed, use Any.

  • You are not required to annotate all the functions in a module.

  • At least annotate your public APIs.

  • Use judgment to get to a good balance between safety and clarity on the one hand, and flexibility on the other.

  • Annotate code that is prone to type-related errors (previous bugs or complexity).

  • Annotate code that is hard to understand.

  • Annotate code as it becomes stable from a types perspective. In many cases, you can annotate all the functions in mature code without losing too much flexibility.

Always prefer breaking between variables, and not, for example, between variable names and type annotations. However, if everything fits on the same line, go for it.

If the combination of the function name, the last parameter, and the return type is too long, indent by 4 in a new line.

When the return type does not fit on the same line as the last parameter, the preferred way is to indent the parameters by 4 on a new line and align the closing parenthesis with the def.
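A sketch of that layout, using a hypothetical function whose signature is too long for one line:

```python
from typing import Dict, Optional


def summarize_inventory(
    warehouse_name: str,
    item_counts: Dict[str, int],
    minimum_count: Optional[int] = None,
) -> Dict[str, int]:
    """Returns the items whose count is at least minimum_count."""
    threshold = 0 if minimum_count is None else minimum_count
    return {
        f'{warehouse_name}/{item}': count
        for item, count in item_counts.items()
        if count >= threshold
    }
```

Each parameter sits on its own line with a 4-space indent, breaks fall between parameters rather than between a name and its annotation, and the closing parenthesis lines up with def.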

pylint allows you to move the closing parenthesis to a new line and align with the opening one, but this is less readable.

As in the examples above, prefer not to break types. However, sometimes they are too long to be on a single line (try to keep sub-types unbroken).

If a single name and type is too long, consider using an alias for the type. The last resort is to break after the colon and indent by 4.

If you need to use a class name from the same module that is not yet defined – for example, if you need the class inside the class declaration, or if you use a class that is defined below – either use from __future__ import annotations for simple cases or use a string for the class name.
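The string form might be used as in this sketch, where the hypothetical TreeNode class refers to itself inside its own definition:

```python
from typing import Optional


class TreeNode:
    """A node that may have a parent node of the same class."""

    def __init__(self, value: int, parent: Optional['TreeNode'] = None):
        # 'TreeNode' is quoted because the class is not yet fully defined.
        self.value = value
        self.parent = parent

    def root(self) -> 'TreeNode':
        """Returns the topmost ancestor of this node."""
        node = self
        while node.parent is not None:
            node = node.parent
        return node
```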

As per PEP-008, use spaces around the = only for arguments that have both a type annotation and a default value.

In the Python type system, NoneType is a “first class” type, and for typing purposes, None is an alias for NoneType. If an argument can be None, it has to be declared! You can use Union, but if there is only one other type, use Optional.

Use explicit Optional instead of implicit Optional. Earlier versions of PEP 484 allowed a: str = None to be interpreted as a: Optional[str] = None, but that is no longer the preferred behavior.
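For example (both functions are hypothetical):

```python
from typing import Optional, Union


def greet(name: Optional[str] = None) -> str:
    """Returns a greeting; None is declared explicitly via Optional."""
    return 'Hello, world' if name is None else f'Hello, {name}'


def parse_port(value: Union[int, str, None]) -> Optional[int]:
    """Converts value to an int port number, passing None through."""
    if value is None:
        return None
    return int(value)
```

Writing name: str = None instead would rely on the implicit Optional that the text above discourages.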

You can declare aliases of complex types. The name of an alias should be CapWorded. If the alias is used only in this module, it should be _Private.

For example, if the name of the module together with the name of the type is too long:

Other examples are complex nested types and multiple return variables from a function (as a tuple).

pytype has a disable option for specific errors (similar to lint):

If an internal variable has a type that is hard or impossible to infer, specify its type with an annotated assignment – use a colon and type between the variable name and value (the same as is done with function arguments that have a default value):
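A short sketch with invented variable names; empty containers are a typical case where the element type cannot be inferred from the value:

```python
from typing import Dict, List, Optional

current_user: Optional[str] = None
pending_jobs: List[str] = []
retries_by_host: Dict[str, int] = {}

pending_jobs.append('build')
retries_by_host['example-host'] = 2
```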

Though you may see them remaining in the codebase (they were necessary before Python 3.6), do not add any more uses of a # type: <type name> comment at the end of the line.

Typed lists can only contain objects of a single type. Typed tuples can either have a single repeated type or a set number of elements with different types. The latter is commonly used as the return type from a function.

The Python type system has generics. The factory function TypeVar is a common way to use them.

Example:
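A minimal sketch, with a hypothetical helper function:

```python
from typing import List, TypeVar

_T = TypeVar('_T')


def next_item(items: List[_T]) -> _T:
    """Removes and returns the last element of items."""
    return items.pop()
```

A type checker can then infer that next_item([1, 2, 3]) is an int and next_item(['a']) is a str.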

A TypeVar can be constrained:
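For instance, restricting the variable to a fixed set of types (the names are illustrative):

```python
from typing import TypeVar

AddableType = TypeVar('AddableType', int, float, str)


def add(a: AddableType, b: AddableType) -> AddableType:
    """Returns a + b; both arguments must be the same permitted type."""
    return a + b
```

Because this TypeVar is constrained, it carries a descriptive name rather than a single letter.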

A common predefined type variable in the typing module is AnyStr. Use it for multiple annotations that can be bytes or str and must all be the same type.

A TypeVar must have a descriptive name, unless it meets all of the following criteria:

  • not externally visible
  • not constrained

Do not use typing.Text in new code. It’s only for Python 2/3 compatibility.

Use str for string/text data. For code that deals with binary data, use bytes.

If all the string types of a function are always the same, for example if the return type is the same as the argument type in the code above, use AnyStr.

For symbols from the typing and collections.abc modules used to support static analysis and type checking, always import the symbol itself. This keeps common annotations more concise and matches typing practices used around the world. You are explicitly allowed to import multiple specific classes on one line from the typing and collections.abc modules. Ex:
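A sketch of this import style follows. It assumes Python 3.9 or later, where the collections.abc classes can be subscripted; total_length is a hypothetical function.

```python
from collections.abc import Mapping, Sequence
from typing import Any


def total_length(groups: Mapping[str, Sequence[Any]]) -> int:
    """Returns the combined length of all sequences in groups."""
    return sum(len(items) for items in groups.values())
```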

Given that this way of importing adds items to the local namespace, names in typing or collections.abc should be treated similarly to keywords, and not be defined in your Python code, typed or not. If there is a collision between a type and an existing name in a module, import it using import x as y.

Use conditional imports only in exceptional cases where the additional imports needed for type checking must be avoided at runtime. This pattern is discouraged; alternatives such as refactoring the code to allow top level imports should be preferred.

Imports that are needed only for type annotations can be placed within an if TYPE_CHECKING: block.

  • Conditionally imported types need to be referenced as strings, to be forward compatible with Python 3.6 where the annotation expressions are actually evaluated.

  • Only entities that are used solely for typing should be defined here; this includes aliases. Otherwise it will be a runtime error, as the module will not be imported at runtime.

  • The block should be right after all the normal imports.

  • There should be no empty lines in the typing imports list.

  • Sort this list as if it were a regular imports list.

    import typing
    if typing.TYPE_CHECKING:
        import sketch
    def f(x: "sketch.Sketch"): ...

Circular dependencies that are caused by typing are code smells. Such code is a good candidate for refactoring. Although technically it is possible to keep circular dependencies, various build systems will not let you do so because each module has to depend on the other.

Replace modules that create circular dependency imports with Any. Set an alias with a meaningful name, and use the real type name from this module (any attribute of Any is Any). Alias definitions should be separated from the last import by one line.

If the best type parameter for a generic is Any, make it explicit, but remember that in many cases TypeVar might be more appropriate:
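The contrast can be sketched with two hypothetical functions that do the same work but promise different things to a type checker:

```python
from typing import Any, Dict, List, TypeVar

_T = TypeVar('_T')


def get_names_any(employee_ids: List[Any]) -> Dict[Any, str]:
    """Maps each ID to a placeholder name; element type is explicitly Any."""
    return {
        employee_id: f'employee-{employee_id}'
        for employee_id in employee_ids
    }


def get_names(employee_ids: List[_T]) -> Dict[_T, str]:
    """Same mapping, but the key type is tied to the input element type."""
    return {
        employee_id: f'employee-{employee_id}'
        for employee_id in employee_ids
    }
```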

4 Parting Words

BE CONSISTENT.

If you’re editing code, take a few minutes to look at the code around you and determine its style. If they use spaces around all their arithmetic operators, you should too. If their comments have little boxes of hash marks around them, make your comments have little boxes of hash marks around them too.

The point of having style guidelines is to have a common vocabulary of coding so people can concentrate on what you’re saying rather than on how you’re saying it. We present global style rules here so people know the vocabulary, but local style is also important. If code you add to a file looks drastically different from the existing code around it, it throws readers out of their rhythm when they go to read it. Avoid this.

==================================

Original: https://www.cnblogs.com/devilmaycry812839668/p/16441143.html
Author: Death_Knight
Title: Google公司的python编码规范指南
