cmd2.parsing

Classes for parsing and storing user input.

class cmd2.parsing.StatementParser(terminators: Optional[Iterable[str]] = None, multiline_commands: Optional[Iterable[str]] = None, aliases: Optional[Dict[str, str]] = None, shortcuts: Optional[Dict[str, str]] = None)

Parse user input as a string into discrete command components.

__init__(terminators: Optional[Iterable[str]] = None, multiline_commands: Optional[Iterable[str]] = None, aliases: Optional[Dict[str, str]] = None, shortcuts: Optional[Dict[str, str]] = None) → None

Initialize an instance of StatementParser.

The following are converted to immutable tuples before being stored internally: terminators, multiline_commands, and shortcuts.

Parameters:
  • terminators – iterable containing strings which should terminate commands
  • multiline_commands – iterable containing the names of commands that accept multiline input
  • aliases – dictionary containing aliases
  • shortcuts – dictionary containing shortcuts
get_command_arg_list(command_name: str, to_parse: Union[cmd2.parsing.Statement, str], preserve_quotes: bool) → Tuple[cmd2.parsing.Statement, List[str]]

Convenience method used by the argument parsing decorators.

Retrieves just the arguments being passed to their do_* methods as a list.

Parameters:
  • command_name – name of the command being run
  • to_parse – what is being passed to the do_* method. It can be one of two types:

    1. An already parsed Statement
    2. An argument string in cases where a do_* method is explicitly called. Calling do_help('alias create') would cause to_parse to be 'alias create'. In this case, the string will be converted to a Statement and returned along with the argument list.

  • preserve_quotes – if True, then quotes will not be stripped from the arguments
Returns:

A tuple containing the Statement and a list of strings representing the arguments
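To make the effect of preserve_quotes concrete, here is a minimal standard-library sketch. It is not cmd2's implementation: sketch_arg_list and strip_quotes are hypothetical helpers written for this illustration, and the use of shlex in non-POSIX mode is an assumption chosen because it leaves quote characters on each token.

```python
import shlex

QUOTES = ('"', "'")


def strip_quotes(arg):
    """Remove one pair of matching outer quotes, if present."""
    if len(arg) > 1 and arg[0] == arg[-1] and arg[0] in QUOTES:
        return arg[1:-1]
    return arg


def sketch_arg_list(arg_string, preserve_quotes):
    """Hypothetical stand-in that returns only the argument list."""
    # Non-POSIX mode keeps the quote characters on each token.
    tokens = shlex.split(arg_string, comments=False, posix=False)
    if preserve_quotes:
        return tokens
    return [strip_quotes(token) for token in tokens]


quoted = sketch_arg_list('create "my alias" value', preserve_quotes=True)
unquoted = sketch_arg_list('create "my alias" value', preserve_quotes=False)
```

With preserve_quotes set to True the middle argument keeps its surrounding double quotes; with False it arrives as the bare string my alias.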

is_valid_command(word: str, *, is_subcommand: bool = False) → Tuple[bool, str]

Determine whether a word is a valid name for a command.

Commands cannot include redirection characters, whitespace, or termination characters. They also cannot start with a shortcut.

Parameters:
  • word – the word to check as a command
  • is_subcommand – Flag whether this command name is a subcommand name
Returns:

a tuple of a boolean and an error string

If word is not a valid command, return False and an error string suitable for inclusion in an error message of your choice:

checkit = '>'
valid, errmsg = statement_parser.is_valid_command(checkit)
if not valid:
    errmsg = "alias: {}".format(errmsg)
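The rules listed above can be approximated with a short standalone check. This is only a sketch under stated assumptions: check_command_name is a hypothetical function, the default terminators and shortcuts are placeholder values, the redirection characters are assumed to be | and >, and the error strings are invented for the example rather than taken from cmd2.

```python
def check_command_name(word, terminators=(';',), shortcuts=('?', '!', '@')):
    """Return (valid, errmsg) for a candidate command name."""
    if not word:
        return False, 'cannot be an empty string'
    # A command name must not begin with any shortcut.
    for shortcut in shortcuts:
        if word.startswith(shortcut):
            return False, 'cannot start with a shortcut: ' + shortcut
    # Assumed redirection characters plus the configured terminators.
    forbidden = ['|', '>'] + list(terminators)
    if any(ch.isspace() for ch in word):
        return False, 'cannot contain whitespace'
    if any(item in word for item in forbidden):
        return False, 'cannot contain redirection characters or terminators'
    return True, ''


ok, ok_msg = check_command_name('my_cmd')
bad, bad_msg = check_command_name('>')
```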
parse(line: str) → cmd2.parsing.Statement

Tokenize the input and parse it into a Statement object, stripping comments, expanding aliases and shortcuts, and extracting output redirection directives.

Parameters: line – the command line being parsed
Returns: a new Statement object
Raises: Cmd2ShlexError if a shlex error occurs (e.g. No closing quotation)
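As a rough illustration of how a line decomposes into the pieces a Statement stores, the following standard-library sketch separates the command, the arguments, and a trailing pipe or redirection. It is not cmd2's parser: sketch_parse is a hypothetical function, it requires whitespace around the redirection tokens, and it skips comment stripping, alias and shortcut expansion, and terminators.

```python
import shlex


def sketch_parse(line):
    """Split a line into command, args, and redirection pieces."""
    # Non-POSIX mode keeps quotes on the tokens.
    tokens = shlex.split(line, comments=False, posix=False)
    parts = {'command': '', 'args': '', 'pipe_to': '',
             'output': '', 'output_to': ''}
    if not tokens:
        return parts
    parts['command'] = tokens[0]
    rest = tokens[1:]
    for index, token in enumerate(rest):
        if token == '|':
            # Everything after the pipe is the shell command.
            parts['pipe_to'] = ' '.join(rest[index + 1:])
            rest = rest[:index]
            break
        if token in ('>', '>>'):
            parts['output'] = token
            # Only the first token after the redirect is the destination.
            parts['output_to'] = ' '.join(rest[index + 1:index + 2])
            rest = rest[:index]
            break
    parts['args'] = ' '.join(rest)
    return parts


redirected = sketch_parse('say "hello world"   again >> log.txt')
piped = sketch_parse('list | grep foo')
```

Joining the remaining tokens with single spaces mirrors the documented behavior that redundant whitespace is removed from args while quotes stay in place.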
parse_command_only(rawinput: str) → cmd2.parsing.Statement

Partially parse input into a Statement object.

The command is identified, and shortcuts and aliases are expanded. Multiline commands are identified, but terminators and output redirection are not parsed.

This method is used by tab completion code and therefore must not generate an exception if there are unclosed quotes.

The Statement object returned by this method can at most contain values in the following attributes: args, raw, command, multiline_command

args will include all output redirection clauses and command terminators.

Unlike parse(), this method does not remove redundant whitespace within args. However, it does ensure args has no leading or trailing whitespace.

Parameters: rawinput – the command line as entered by the user
Returns: a new Statement object
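One way to identify the command without ever failing on unclosed quotes is to avoid a lexer altogether and use a regular expression. The sketch below assumes that approach; sketch_command_only is a hypothetical function, not cmd2's code, and it does not stop the command name at terminators or redirection characters as a fuller version would need to.

```python
import re


def sketch_command_only(rawinput):
    """Return (command, args) without lexing quotes."""
    match = re.match(r'\s*(\S+)(.*)', rawinput, re.DOTALL)
    if match is None:
        return '', ''
    command, rest = match.group(1), match.group(2)
    # Trim the ends only; interior whitespace is left untouched.
    return command, rest.strip()


command, args = sketch_command_only('  say  "hello   world > out.txt ')
```

Here the unclosed quote raises no exception, the redirection clause stays inside args, and the three spaces between hello and world are preserved.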
split_on_punctuation(tokens: List[str]) → List[str]

Further splits tokens from a command line using punctuation characters.

Punctuation characters are treated as word breaks when they are in unquoted strings. Each run of punctuation characters is treated as a single token.

Parameters: tokens – the tokens as parsed by shlex
Returns: a new list of tokens, further split using punctuation
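The splitting rule can be sketched with itertools.groupby. This is an illustration, not cmd2's code: sketch_split_on_punctuation is a hypothetical function, the punctuation set is an assumed default, a token counts as quoted only when it begins with a quote character, and a "run" is read here as repeats of the same punctuation character.

```python
from itertools import groupby


def sketch_split_on_punctuation(tokens, punctuation=(';', '|', '>')):
    """Break unquoted tokens at runs of punctuation characters."""
    result = []
    for token in tokens:
        # Single characters and quoted tokens pass through unchanged.
        if len(token) <= 1 or token[0] in ('"', "'"):
            result.append(token)
            continue
        # Punctuation groups by character; everything else groups together.
        runs = groupby(token, key=lambda ch: ch if ch in punctuation else None)
        for _, run in runs:
            result.append(''.join(run))
    return result


split_tokens = sketch_split_on_punctuation(
    ['say', 'hello>>out.txt', '"a>b"', 'x;'])
```

The unquoted token hello>>out.txt becomes three tokens with >> kept whole, while the quoted token "a>b" is left intact.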
tokenize(line: str) → List[str]

Lex a string into a list of tokens. Shortcuts and aliases are expanded and comments are removed.

Parameters: line – the command line being lexed
Returns: A list of tokens
Raises: Cmd2ShlexError if a shlex error occurs (e.g. No closing quotation)
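The error case mentioned above comes from the standard shlex module, which raises ValueError when a quote is never closed. The sketch below shows the general pattern of wrapping that error in a library-specific exception. LexError and sketch_tokenize are hypothetical names standing in for the real ones, and alias and shortcut expansion are omitted.

```python
import shlex


class LexError(Exception):
    """Hypothetical stand-in for a library-specific lexing error."""


def sketch_tokenize(line):
    """Lex a line into tokens, converting shlex failures to LexError."""
    try:
        # Non-POSIX mode keeps the quotes on each token.
        return shlex.split(line, comments=False, posix=False)
    except ValueError as exc:
        raise LexError(str(exc)) from exc


tokens = sketch_tokenize('say "hello world"')

try:
    sketch_tokenize('say "hello')
    error_message = ''
except LexError as exc:
    error_message = str(exc)
```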
class cmd2.Statement(args='', raw='', command='', arg_list=NOTHING, multiline_command='', terminator='', suffix='', pipe_to='', output='', output_to='')

String subclass with additional attributes to store the results of parsing.

The cmd module in the standard library passes commands around as a string. To retain backwards compatibility, cmd2 does the same. However, we need a place to capture the additional output of the command parsing, so we add our own attributes to this subclass.

Instances of this class should not be created by anything other than the cmd2.parsing.StatementParser.parse() method, nor should any of the attributes be modified once the object is created.

The string portion of the class contains the arguments, but not the command, nor the output redirection clauses.
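The general technique of a string subclass that carries extra attributes can be sketched in a few lines. ParsedLine is a hypothetical class written for this illustration and does not reproduce how Statement is actually defined; it only shows why such an object still behaves like a plain string holding the arguments.

```python
class ParsedLine(str):
    """A str whose value is the arguments, with parse results attached."""

    def __new__(cls, args='', *, command='', raw=''):
        # str is immutable, so the value must be set in __new__.
        obj = super().__new__(cls, args)
        obj.command = command
        obj.raw = raw
        return obj


line = ParsedLine('hello world', command='say', raw='say hello world')
shouted = line.upper()
```

Code written for the standard library's cmd module can keep treating line as the argument string, while newer code reads line.command or line.raw.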

Tips:

  1. argparse is your friend for anything complex. cmd2 has two decorators (with_argparser() and with_argparser_and_unknown_args()) which you can use to make your command method receive a namespace of parsed arguments, whether positional or denoted with switches.
  2. For commands with simple positional arguments, use args or arg_list.
  3. If you don’t want to have to worry about quoted arguments, see argv for a trick which strips quotes off for you.
command

The name of the command after shortcuts and macros have been expanded

args

The arguments to the command as a string with spaces between the words, excluding output redirection and command terminators. If the user used quotes in their input, they remain here, and you will have to handle them on your own.

arg_list

The arguments to the command as a list, excluding output redirection and command terminators. Each argument is represented as an element in the list. Quoted arguments remain quoted. If you want to remove the quotes, use cmd2.utils.strip_quotes() or use argv[1:].

raw

If you want full access to exactly what the user typed at the input prompt you can get it, but you’ll have to parse it on your own, including:

  • shortcuts and aliases
  • quoted commands and arguments
  • output redirection
  • multi-line command terminator handling

If you use multiline commands, all the input will be passed to you in this string, but there will be embedded newlines where the user hit return to continue the command on the next line.

multiline_command

If the command is a multi-line command, the name of the command will be in this attribute. Otherwise, it will be an empty string.

terminator

If the command is a multi-line command, this attribute contains the termination character entered by the user to signal the end of input.

suffix

Any characters present between the input terminator and the output redirection tokens.

pipe_to

If the user piped the output to a shell command, this attribute contains the entire shell command as a string. Otherwise it is an empty string.

output

If output was redirected by the user, this contains the redirection token, e.g. >>.

output_to

If output was redirected by the user, this contains the requested destination with quotes preserved.

argv

A list of arguments à la sys.argv.

The first element of the list is the command after shortcut and macro expansion. Subsequent elements of the list contain any additional arguments, with quotes removed, just like bash would. This is very useful if you are going to use argparse.parse_args().

If you want to strip quotes from the input, you can use argv[1:].
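Because argv has the same shape as sys.argv, its tail can be handed directly to argparse. The sketch below uses a hand-written list in place of a real Statement.argv value; the speak command and its --repeat option are invented for the example.

```python
import argparse

# Hypothetical argv, as it might look for: speak --repeat 2 "hello world"
argv = ['speak', '--repeat', '2', 'hello world']

parser = argparse.ArgumentParser(prog=argv[0])
parser.add_argument('--repeat', type=int, default=1)
parser.add_argument('words', nargs='+')

# Skip element 0 (the command name), just as with sys.argv.
namespace = parser.parse_args(argv[1:])
```

Since the quotes are already removed, the quoted phrase reaches argparse as a single positional value.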

command_and_args

Combine command and args with a space separating them.

Quoted arguments remain quoted. Output redirection and piping are excluded, as are any command terminators.

expanded_command_line

Concatenate command_and_args and post_command.

post_command

A string containing any ending terminator, suffix, and redirection characters.
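How the three derived values relate can be sketched with plain functions. These are hypothetical helpers, not cmd2's code, and the exact spacing between the pieces is an assumption made for the illustration.

```python
def sketch_command_and_args(command, args):
    """Join command and args with one space; omit the space if no args."""
    return command + ' ' + args if command and args else command


def sketch_post_command(terminator='', suffix='', pipe_to='',
                        output='', output_to=''):
    """Assemble everything that follows the command and its arguments."""
    text = terminator
    if suffix:
        text += ' ' + suffix
    if pipe_to:
        text += ' | ' + pipe_to
    if output:
        text += ' ' + output
        if output_to:
            text += ' ' + output_to
    return text


head = sketch_command_and_args('say', 'hello')
tail = sketch_post_command(terminator=';', output='>', output_to='out.txt')
expanded = head + tail
```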