book_cataloguing module reference
Note
You may notice that with the current layout of the book_cataloguing package, all of these functions are actually defined in the file contents.py. However, this layout is subject to change in future versions of the package; please import functions from book_cataloguing itself rather than book_cataloguing.contents.
Unicode support?
book_cataloguing has some support for non-ASCII characters:
>>> from book_cataloguing import capitalize_title
>>> print(capitalize_title("l'île noire"))
L'Île Noire
However, this support is experimental, and subject to change: please do not rely on it for much. The package does not actually support any language other than English; it probably will not do a good job capitalizing non-English book titles that are more complicated than the one above.
Changing internal lists
In the module book_cataloguing.contents, four lists of strings are created for later use by the functions defined therein. The names of these lists are subject to change, but currently they are:
LOWERCASE_TITLE_WORDS,
LOWERCASE_AUTHOR_WORDS,
MAC_SURNAMES, and
AUTHOR_TITLES.
(More information on each list is available below in the corresponding function.)
Their starting values are (in the developer’s opinion) quite suitable for general use, but a function is provided to change each one, if you choose. In each case, the new strings to put in the list must be in an external file, whose name is passed to the function.
- book_cataloguing.set_lowercase_title_words(filename: str | None = None) → None
Get a new list of lowercase words in book titles from a file.
The file should contain words such as “the”, “a”, and “of”, which should not be capitalized when they appear in the title of a book (unless they are at the beginning or end of a title or subtitle).
The referenced file should have one word on each line. The case of the words does not matter, and they need not be sorted in any particular order.
If filename is None, the default file for this list (book_cataloguing/lowercase_title_words.txt) will be used.
- book_cataloguing.set_lowercase_author_words(filename: str | None = None) → None
Get a new list of lowercase words in author names from a file.
The file should contain words such as “le”, “von”, and “of”, which should not be capitalized when they are part of an author’s name, and which might be part of a multi-word surname (such as “von Neumann”).
The referenced file should have one word on each line. The case of the words does not matter, and they need not be sorted in any particular order.
If filename is None, the default file for this list (book_cataloguing/lowercase_author_words.txt) will be used.
- book_cataloguing.set_mac_surnames(filename: str | None = None) → None
Get a new list of surnames starting with “Mac” from a file.
The file should contain names such as “MacDonald”, in which the fourth letter (the letter following the “Mac”) should be capitalized.
The referenced file should have one word on each line. The case of the words does not matter, and they need not be sorted in any particular order.
If filename is None, the default file for this list (book_cataloguing/mac_surnames.txt) will be used.
- book_cataloguing.set_author_titles(filename: str | None = None) → None
Get a new list of author titles from a file.
The file should contain words such as “lord”, “mrs”, and “president”, which, when they appear in an author’s name, are likely titles rather than part of the name itself.
The referenced file should have one word on each line. The case of the words does not matter, and they need not be sorted in any particular order.
If filename is None, the default file for this list (book_cataloguing/author_titles.txt) will be used.
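As a minimal sketch of the workflow described above, the following writes a custom word list in the one-word-per-line layout and passes its path to set_lowercase_title_words(). The helper write_word_list, the file name, the word list itself, and the UTF-8 encoding are assumptions made for illustration; only the set_lowercase_title_words() signature comes from this reference. The import is guarded so that the snippet still runs where the package is not installed.

```python
import os
import tempfile


def write_word_list(words, path):
    """Write one word per line (hypothetical helper, not part of the package)."""
    with open(path, "w", encoding="utf-8") as handle:
        for word in words:
            handle.write(word + "\n")
    return path


# Illustrative list; case and order do not matter according to the text above.
custom_words = ["the", "a", "an", "of", "and", "versus"]
tmp_dir = tempfile.mkdtemp()
word_file = write_word_list(custom_words, os.path.join(tmp_dir, "my_title_words.txt"))

try:
    import book_cataloguing
except ImportError:
    book_cataloguing = None  # package not installed; skip the calls below

if book_cataloguing is not None:
    # Load the custom list from the file written above.
    book_cataloguing.set_lowercase_title_words(word_file)
    # Calling with no argument (filename=None) goes back to the default file.
    book_cataloguing.set_lowercase_title_words()
```

The other three setter functions take the same single filename argument, so the same pattern should apply to them.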
Main Functions
- book_cataloguing.capitalize_title(title: str, handle_mc_prefix: bool = True) → str
Capitalize a book title, preserving all non-alphanumeric characters.
This function considers all non-alphanumeric characters except apostrophes to separate words, and it converts all words recognized as Roman numerals to uppercase. It also capitalizes the second letter of words starting with any letter followed by an apostrophe (e.g. O’Brien). See Examples below.
- Parameters:
title (str) – Title to capitalize.
handle_mc_prefix (bool) – Whether or not to treat words starting with “mc” or “mac” differently. When True, capitalize the third letter of all words starting with “mc” (e.g. convert “mcdonald” to “McDonald”), and the fourth letter of all words starting with “mac” if they are in the list of Mac surnames. (You can change this list with the function set_mac_surnames().) These prefixes are detected case-insensitively. When False, capitalize only the first letter of such names.
- Returns:
Capitalized version of title.
- Return type:
str
Examples
>>> capitalize_title("the hobbit: or, there and back again")
'The Hobbit: Or, There and Back Again'
>>> capitalize_title(" THE*LORD =of tHE RIngs]")
' The*Lord =of the Rings]'
>>> capitalize_title("the thirteen-gun salute")
'The Thirteen-Gun Salute'
>>> capitalize_title("a midsummer night's dream")
"A Midsummer Night's Dream"
Handling of Roman numerals:
>>> capitalize_title("henry vi, part ii")
'Henry VI, Part II'
Handling of name prefixes:
>>> capitalize_title("A BIOGRAPHY OF GEORGE MACDONALD")
'A Biography of George MacDonald'
>>> capitalize_title("a biography of george macdonald", False)
'A Biography of George Macdonald'
>>> capitalize_title("a biography of patrick o'brien")
"A Biography of Patrick O'Brien"
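The word-separation rule stated for capitalize_title() (every non-alphanumeric character except the apostrophe separates words) can be sketched independently of the package. The regular expression and the name split_words below are assumptions made for illustration, not the package's implementation.

```python
import re

# A word is taken to be a maximal run of letters, digits, and apostrophes.
# [^\W_] matches any Unicode letter or digit (that is, \w without "_").
WORD_RE = re.compile(r"(?:[^\W_]|')+")


def split_words(text):
    """Return the words of text under the separator rule sketched above."""
    return WORD_RE.findall(text)


starred = split_words(" THE*LORD =of tHE RIngs]")
hyphenated = split_words("the thirteen-gun salute")
possessive = split_words("a midsummer night's dream")
```

Under this rule the hyphen in “thirteen-gun” splits the word in two, which is consistent with both halves being capitalized in the example above, while “night's” stays a single word.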
- book_cataloguing.capitalize_author(author: str, handle_mc_prefix: bool = True) → str
Capitalize the name of an author, preserving non-alphanumeric characters.
This function considers all non-alphanumeric characters except apostrophes to separate words, and it converts all words recognized as Roman numerals to uppercase. It also capitalizes the second letter of words starting with any letter followed by an apostrophe (e.g. O’Brien). See Examples below.
- Parameters:
author (str) – Author name to capitalize.
handle_mc_prefix (bool) – Whether or not to treat words starting with “mc” or “mac” differently. When True, capitalize the third letter of all words starting with “mc” (e.g. convert “mcdonald” to “McDonald”), and the fourth letter of all words starting with “mac” if they are in the list of Mac surnames. (You can change this list with the function set_mac_surnames().) These prefixes are detected case-insensitively. When False, capitalize only the first letter of such names.
- Returns:
Capitalized version of author name.
- Return type:
str
- book_cataloguing.get_sortable_title(title: str, handle_mc_prefix: bool = True, correct_case: bool = True, smart_numbers: bool = True) → str
Return a representation of the title that is usable for sorting.
This involves removing the first word of the title if it is “a”, “an”, or “the”, and removing non-alphanumeric characters as well.
From this function’s point of view, a word separator is any combination of non-alphanumeric characters that contains a space. See Examples below.
- Parameters:
title (str) – Title to return sortable version of.
handle_mc_prefix (bool) – If correct_case is True (see below), then pass this parameter as a keyword argument with the same name in the call to capitalize_title(). Default True.
correct_case (bool) – If True, capitalize the title with the function capitalize_title() before returning it. If False, return the title in all lowercase. Default True.
smart_numbers (bool) – If True, convert all Arabic numerals in the title to their written-out equivalents. See Number Handling below. Default True.
- Returns:
Sortable version of title, with no leading “a”, “an”, or “the”.
- Return type:
str
Number Handling
When the parameter smart_numbers is True (the default), all words in the title made entirely of ASCII numerals will be converted to their written-out equivalents. Comma-separated numbers will also be converted as if the commas were not present (e.g. “30,000” to “thirty thousand”, “1,2” to “twelve”). If a word begins with a numeral but contains letters as well, the entire word will be replaced with the ordinal form of the number which begins it. Thus “1st” will be replaced with “first”, and “21st”, “21nd”, and “21st0” will all be replaced with “twenty-first”.
Examples
>>> get_sortable_title("an episode of sparrows")
'Episode of Sparrows'
>>> get_sortable_title(" `the +Hob.bit")
'Hobbit'
>>> get_sortable_title("MOSTLY H-ARMLESS)")
'Mostly Harmless'
When correct_case is False:
>>> get_sortable_title("an episode of sparrows", correct_case=False)
'episode of sparrows'
>>> get_sortable_title(" `the +Hob.bit", correct_case=False)
'hobbit'
>>> get_sortable_title("MOSTLY H-ARMLESS)", correct_case=False)
'mostly harmless'
With numbers in the title:
>>> get_sortable_title("20,000 leagues under the sea")
'Twenty Thousand Leagues Under the Sea'
>>> get_sortable_title("Around the World in 8,0 Days", correct_case=False)
'around the world in eighty days'
>>> get_sortable_title("the 1st 2 lives of lukas-kasha")
'First Two Lives of Lukas-Kasha'
>>> # Commas within numbers will be removed even if smart_numbers == False,
>>> # as they are non-alphanumeric
>>> get_sortable_title("20,000 leagues under the sea", smart_numbers=False)
'20000 Leagues Under the Sea'
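The Number Handling rules for get_sortable_title() distinguish two cases: a word made only of digits (optionally with commas) is read as a cardinal number, and a word that starts with digits but contains other characters is read as an ordinal. The sketch below only classifies a word and extracts the integer; it does not spell numbers out. The function classify_number_word is hypothetical and is not how the package is necessarily implemented.

```python
import re


def classify_number_word(word):
    """Return (kind, value) for a word that starts with a digit, else None."""
    # Digits with optional comma groups: commas are dropped before conversion.
    if re.fullmatch(r"[0-9]+(?:,[0-9]+)*", word):
        return ("cardinal", int(word.replace(",", "")))
    # Leading digits followed by anything else: only the leading number counts.
    leading = re.match(r"[0-9]+", word)
    if leading:
        return ("ordinal", int(leading.group()))
    return None


cardinal_examples = [classify_number_word(w) for w in ("30,000", "1,2", "8,0")]
ordinal_examples = [classify_number_word(w) for w in ("1st", "21nd", "21st0")]
```

Because only the leading digits matter in the second case, the suffix is ignored entirely, which is why “21st”, “21nd”, and “21st0” all map to the same ordinal in the description above.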
- book_cataloguing.get_sortable_author(author: str, handle_mc_prefix: bool = True, correct_case: bool = True) → str
Return author’s name in the format “last, first”.
This function considers all non-alphanumeric characters except apostrophes to separate words. It also places periods after one-letter words (assuming them to be initials), and it removes all non-alphanumeric characters in the result except for:
These periods,
All hyphens and apostrophes,
The comma separating the first and last names, and
The period after “jr” or “sr”, if applicable.
By default, the author’s surname is assumed to be one word long. However, if the last part of the name is a Roman numeral, “jr”, or “sr”, it is assumed to be part of the surname. Also, if the surname is prefixed with a word in the list of lowercase author words (such as “le” or “von”), that word is assumed to be part of the surname. (You may change this list with the function set_lowercase_author_words().) See Examples below.
According to the Anglo-American Cataloguing Rules, authors whose names begin with “mc” should be alphabetized as if their names start with “mac”. This function replaces the prefix “mc” in this way to make that rule easier to follow; again, please see Examples.
Lastly, this function removes from the given name words such as “lord” and “mr” that are in the list of author titles. (You may change this list with the function set_author_titles().)
- Parameters:
author (str) – Author name to return in “last, first” format.
handle_mc_prefix (bool) – If correct_case is True (see below), then pass this parameter as a keyword argument with the same name in the call to capitalize_author(). Default True. Please note that this parameter does not change whether or not the “mc” prefix is replaced with “mac” as mentioned above; this behavior cannot be disabled. It only controls the capitalization of such prefixes.
correct_case (bool) – If True, capitalize the author’s name with the function capitalize_author() before returning it. If False, return the name in all lowercase. Default True.
- Returns:
Author’s name in “last, first” format.
- Return type:
str
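To illustrate why replacing “mc” with “mac” matters for alphabetization, here is a minimal sketch of a sort key that applies only that one rule. The function mac_sort_key and the sample surnames are assumptions made for illustration; the sketch does not reproduce get_sortable_author() or its output format.

```python
def mac_sort_key(surname):
    """Lowercase the surname and treat a leading "mc" as if it were "mac"."""
    lowered = surname.lower()
    if lowered.startswith("mc"):
        return "mac" + lowered[2:]
    return lowered


surnames = ["McDonald", "Mabry", "MacAdam", "Madison"]
plain_order = sorted(surnames, key=str.lower)
mac_order = sorted(surnames, key=mac_sort_key)
```

With a plain case-insensitive sort, “McDonald” lands after “Madison”; with the rule applied, it files between “MacAdam” and “Madison”, alongside the other “Mac” names.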
- book_cataloguing.title_sort(iterable: Iterator[Any], /, *, key: Callable[[Any], str] | None = None, reverse: bool = False, smart_numbers: bool = True) → list[Any]
Sort the given objects as if they are book titles.
- Parameters:
iterable (Iterator[Any]) – Iterator of objects to sort.
key (Optional[Callable[[Any], str]]) – Function with which to extract a comparison key from each item from the iterable. Default is None (items are compared directly).
reverse (bool) – Whether or not to reverse the sorted order, making it descending instead of ascending. Default False.
smart_numbers (bool) – This parameter is supplied as a keyword argument with the same name in the calls to get_sortable_title().
- Returns:
Sorted list of given objects.
- Return type:
list[Any]
The given titles are not sorted as they are; instead the return values of a call to get_sortable_title() for each given object are sorted. Thus, please see the documentation for that function for more details on the sorting. The calls have the correct_case argument set to False, so comparisons are case-insensitive.
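The idea of sorting objects by a derived, sortable form of each title rather than by the titles themselves can be sketched in plain Python. The key function below is a deliberately simplified stand-in: it only lowercases the title and drops a leading “a”, “an”, or “the”, and it omits the punctuation removal and number conversion that get_sortable_title() is described as performing. The name leading_article_key is hypothetical.

```python
LEADING_ARTICLES = {"a", "an", "the"}


def leading_article_key(title):
    """Simplified sort key: lowercase, without a leading article."""
    words = title.lower().split()
    if words and words[0] in LEADING_ARTICLES:
        words = words[1:]
    return " ".join(words)


titles = [
    "The Hobbit",
    "An Episode of Sparrows",
    "Mostly Harmless",
    "A Midsummer Night's Dream",
]
# The original strings are returned; only the comparison uses the derived key.
sorted_titles = sorted(titles, key=leading_article_key)
```

As with title_sort(), the returned list contains the original objects unchanged; the sortable form is used only for comparison.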
- book_cataloguing.author_sort(iterable: Iterator[Any], /, *, key: Callable[[Any], str] | None = None, reverse: bool = False) → list[Any]
Sort the given objects as if they are the authors of books.
- Parameters:
iterable (Iterator[Any]) – Iterator of objects to sort.
key (Optional[Callable[[Any], str]]) – Function with which to extract a comparison key from each item from the iterable. Default is None (items are compared directly).
reverse (bool) – Whether or not to reverse the sorted order, making it descending instead of ascending. Default False.
- Returns:
Sorted list of given objects.
- Return type:
list[Any]
The given authors are sorted case-insensitively: first by last name, and then by first name. The last and first names used by this function correspond exactly to those determined by get_sortable_author(), and put before and after the comma by that function. Thus, please see its documentation for details on the sorting.
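Sorting “first by last name, and then by first name” amounts to comparing a two-part key. The sketch below assumes its inputs are already strings in “last, first” form and splits them at the first comma; the function last_first_key and the sample names are assumptions made for illustration, not the package's code.

```python
def last_first_key(name):
    """Split "last, first" at the first comma into a lowercase (last, first) pair."""
    last, _, first = name.partition(",")
    return (last.strip().lower(), first.strip().lower())


names = ["Tolkien, J. R. R.", "Adams, Douglas", "adams, Abigail"]
# Tuples compare element by element, so last names decide first and
# first names only break ties.
sorted_names = sorted(names, key=last_first_key)
```

Lowercasing both parts is what makes the comparison case-insensitive, so “adams, Abigail” sorts ahead of “Adams, Douglas” despite the differing capitalization.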