diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000..cc29df8 --- /dev/null +++ b/.gitignore @@ -0,0 +1 @@ +src/__pycache__ \ No newline at end of file diff --git a/README.md b/README.md index 594163d..b69ce99 100644 --- a/README.md +++ b/README.md @@ -1,12 +1,12 @@ -https://user-images.githubusercontent.com/113618658/214463785-49c419e9-c959-4849-91d2-f3407ecaa73d.mp4 + # Musort -Ogranize your music library. A Python3 program that renames all selected music/audio files in a folder with a specified naming convention. Names are generated from the metadata (ID3) from the audio files. Before using this program, use a metadata editor like MusicBrainz Picard, Beets or EasyTAG to add the correct metadata to the audio files. +Musort: Effortlessly organize your music library with this Python3 program. Rename selected music/audio files in a folder using a customizable naming convention based on metadata (ID3) from the audio files. Ensure accurate metadata by using popular tools like MusicBrainz Picard, Beets, or EasyTAG before running Musort. Simplify your music organization and enhance file names for a more enjoyable library experience. ## Features * Rename many audio files at once -* Rename all files in subdirectories as well (recursive) +* Rename audio files in subdirectories as well (recursive) * Choose the naming convention (ex. track.title.flac or artist.track.year.mp3) * Give a separator for the naming of the file (ex. track.title.flac or track_title.flac) * Works on all systems that can run Python @@ -21,13 +21,17 @@ Ogranize your music library. 
A Python3 program that renames all selected music/a * AIFF/AIFF-C ## Dependencies -**Note: When using the install script, TinyTag will be automatically installed** +**Make sure to install these programs to be able to run Musort** - [Python3](https://www.python.org/) -- [TinyTag](https://pypi.org/project/tinytag/) (Installable from Python Package Index) +- [Docker (Optional)](https://docker.com) + +The Python3 library TinyTag is also used, but is already included in this repository. Therefore, there is no need to install TinyTag for only this project. + ## Installation and Usage ### Method 1: Run installation script (Unix/Linux based OS only) -The installation script will move the python program to `~/.local/bin`. Make sure that `~/.local/bin` exists and that is added to $PATH. + +The installation script will move the python program to `~/.local/bin`. The installation directory can be changed in the `install.sh` script. **Note: The installation directory should be added to $PATH** ``` Bash git clone https://github.com/tdeerenberg/Musort.git cd Musort.git @@ -35,7 +39,10 @@ chmod +x install.sh ./install.sh ``` After that, simply use the command `musort` to use the program. +
+
 ### Method 2: Clone repo and run manually (All Operating Systems)
+
 Clone the repository and run the Python program
 ``` Bash
 git clone https://github.com/tdeerenberg/Musort.git
@@ -43,32 +50,42 @@ cd Musort
 pip install requirements.txt
 ```
 After that, run the program with `python3 musort.py`.
+
### Method 3: Docker installation + ``` Bash git clone https://github.com/tdeerenberg/Musort.git cd Musort docker build -t musort . ``` - After the docker installation is complete, musort can be run with: `docker run --name musort --rm -v "/:/HostMountedFS" -it musort` - -> Tip: You could alias something like `alias musortd="docker run --name musort --rm -v "/:/HostMountedFS" -it musort"` then use `musortd` juse like `musort` usage is explained above +After the Docker installation/build is complete, Musort can be run with: -## Manual (options and arguments) `musort --help` +`docker run --name musort --rm -v "[music_directory_host]:[music_directory_container]" -it musort [music_directory_container]` + +The music folder must be mounted to the Docker container, therefore the `-v` option must be used to mount the directory. + +An example of running Musort in Docker, using `/home/user/music` as music folder: + +`docker run --name musort --rm -v '/home/user/music:/music' -it musort /music` + +## Manual with options and arguments (`musort --help`) ``` USAGE: -musort [DIRECTORY] [NAMING_CONVENTION] [OPTIONAL_OPTIONS]... +musort [DIRECTORY] [OPTIONAL_PARAMETERS] USAGE EXAMPLES: - musort ~/music track.title.year -s _ -r - musort /local/music disc.artist.title.album -r - musort ~/my_music track.title + musort ~/music + musort /local/music -f disc.artist.title.album -r + musort ~/my_music -s _ -r OPTIONAL OPTIONS: -h, --help Show the help menu +-f, --format set the naming convention (see 'NAMING CONVENTION:' below) -s, --separator Set the separator for the filename (ex. '-s .' -> 01.track.flac and '-s -' -> 01-track.mp3) - Default separator ( . ) will be used if none is given + Default separator '_' will be used if none is given -r, --recursive Rename files in subdirectories as well -v, --version Prints the version number + NAMING CONVENTION: FORMAT_OPTION.FORMAT_OPTION... The amount of format options does not matter. It can be one, two, three, even all of them. 
@@ -96,13 +113,13 @@ year year or date as string ## Possible features to add * Rename single file -* Other installation methods (Like AUR, Docker, etc.) +* Other installation methods (e.g. via AUR) * Open for suggestions! * Feel free to open a pull request or issue! ## Authors -- [@tdeerenberg](https://www.github.com/tdeerenberg) +- [@tdeerenberg](https://github.com/tdeerenberg) ## License diff --git a/dockerfile b/dockerfile index 18100bb..3612f1b 100644 --- a/dockerfile +++ b/dockerfile @@ -1,17 +1,39 @@ -FROM python:3.10-slim +# Musort - Docker installation +# Copyright (C) 2023 tdeerenberg +# +# Sources on github: +# https://github.com/tdeerenberg/Musort +# +# Licensed under the GNU General Public License v3.0 (GPLv3) +# Copyright (C) 2023 tdeerenberg +# +# This program is free software: you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation, either version 3 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see . 
-# Probably should do update/upgrade for any CVEs +FROM python:3.11 + +# Update and upgrade packages +RUN apt-get update && apt-get upgrade -y + +# Set working directory WORKDIR / -# Set up requirements -COPY requirements.txt requirements.txt -# hadolint ignore=DL3013 -RUN python3 -m pip install pip --upgrade --no-cache-dir && \ - python3 -m pip install -r requirements.txt --no-cache-dir -# Copy in code -COPY ./src/musort-docker.py /musort.py +# Copy project +COPY ./src/config.py /config.py +COPY ./src/variables.py /variables.py +COPY ./src/tinytag.py /tinytag.py +COPY ./src/musort.py /musort.py # docker run --name musort --rm -it musort --help -ENTRYPOINT ["python3", "/musort.py"] - +ENTRYPOINT ["python3", "/musort.py"] \ No newline at end of file diff --git a/install.sh b/install.sh index 496112f..6e4579b 100755 --- a/install.sh +++ b/install.sh @@ -1,5 +1,55 @@ #!/bin/bash -# Make sure to add ~/.local/bin to $PATH -pip3 install -r requirements.txt -cp src/musort.py ~/.local/bin/musort -echo "IF NOT DONE ALREADY, ADD '~/.local/bin' TO $ PATH" \ No newline at end of file +# +# Musort - Install and Upgrade Script +# Copyright (C) 2023 tdeerenberg +# +# Sources on github: +# https://github.com/tdeerenberg/Musort +# +# Licensed under the GNU General Public License v3.0 (GPLv3) +# Copyright (C) 2023 tdeerenberg +# +# This program is free software: you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation, either version 3 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see . 
+ +# Define paths +install_path=~/.local/bin +script_name=musort +script_path=src/musort.py +script_config=src/config.py +script_tinytag=src/tinytag.py +script_variables=src/variables.py + +# Check if script already exists +if [ -e "$install_path/$script_name" ]; then + read -p "Musort is already installed. Do you want to overwrite it with the new version? (y/n): " response + if [[ $response =~ ^[Yy]$ ]]; then + echo "Updating Musort..." + cp "$script_path" "$install_path/$script_name" + cp "$script_config" "$install_path/" + cp "$script_tinytag" "$install_path/" + cp "$script_variables" "$install_path/" + echo "Musort updated successfully!" + else + echo "Upgrade canceled. Musort remains unchanged." + fi +else + # Install Musort + cp "$script_path" "$install_path/$script_name" + cp "$script_config" "$install_path/" + cp "$script_tinytag" "$install_path/" + cp "$script_variables" "$install_path/" + echo "Musort installed successfully!" +fi + +echo "If not done already, add '$install_path' to \$PATH" \ No newline at end of file diff --git a/src/musort-docker.py b/musort-docker.py similarity index 100% rename from src/musort-docker.py rename to musort-docker.py diff --git a/requirements.txt b/requirements.txt deleted file mode 100644 index 74789ed..0000000 --- a/requirements.txt +++ /dev/null @@ -1 +0,0 @@ -TinyTag \ No newline at end of file diff --git a/src/config.py b/src/config.py new file mode 100644 index 0000000..d72f4cd --- /dev/null +++ b/src/config.py @@ -0,0 +1,43 @@ +#!/usr/bin/env python +# -*- coding: utf-8 -*- +# +# Musort - A command-line tool for effortlessly organizing and renaming your music files based on metadata +# Copyright (C) 2023 tdeerenberg +# +# Sources on github: +# https://github.com/tdeerenberg/Musort +# +# Licensed under the GNU General Public License v3.0 (GPLv3) +# Copyright (C) 2023 tdeerenberg +# +# This program is free software: you can redistribute it and/or modify +# it under the terms of the GNU General Public License as 
published by +# the Free Software Foundation, either version 3 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see . + + +# +# These are the default settings of Musort. +# Default settings are used when no parameters are given. +# The default settings may be changed to your liking. +# + +# Change separator between the naming format +separator = "." + +# Toggle recursively renaming though subdirectories +recursive = False + +# Format for renaming music files +name_format = "track.title" + +# Replacement for illegal characters +forbidden_char_replace = "_" \ No newline at end of file diff --git a/src/musort.py b/src/musort.py old mode 100755 new mode 100644 index 44ebcb3..7b4a36a --- a/src/musort.py +++ b/src/musort.py @@ -1,187 +1,151 @@ #!/usr/bin/env python - -# Licensed under GPLv3 +# +# Musort - A command-line tool for effortlessly organizing and renaming your music files based on metadata # Copyright (C) 2023 tdeerenberg +# +# Sources on github: +# https://github.com/tdeerenberg/Musort +# +# Licensed under the GNU General Public License v3.0 (GPLv3) +# Copyright (C) 2023 tdeerenberg +# +# This program is free software: you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation, either version 3 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. 
+# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see . -from tinytag import TinyTag import os -import sys, getopt +import sys +import getopt import logging +from config import * +from variables import * +from tinytag import TinyTag -supported_formats = ["flac", "mp3", "mp2", "mp1", "opus", "ogg", "wma"] +class Musort: + def __init__(self): + """Initial setup""" + self.recursive = recursive + self.separator = separator + self.format = name_format.split('.') + self.replace = forbidden_char_replace + self.directory = None + self.files = [] + logging.basicConfig(level=logging.DEBUG, format='[%(levelname)s] %(asctime)s - %(message)s') -version = "Musort v0.2 (c) tdeerenberg" -help=\ -"""Musort (c) 2023 tdeerenberg (github.com/tdeerenberg) + def display_settings(self): + """Displays the settings""" + logging.info(f"Recursive renaming: '{self.recursive}'.") + logging.info(f"Separator: '{self.separator}'.") + logging.info(f"Format: {self.format}.") -DESCRIPTION: -A Python3 program that renames all selected music/audio files in a folder with a specified naming convention + def get_files(self): + """Check files from the given directory""" + if self.recursive: + self.files = [ + os.path.join(root, filename) + for root, _, files in os.walk(self.directory) + for filename in files + if self.is_compatible(filename) + ] + else: + self.files = [ + os.path.join(os.path.abspath(self.directory), filename) + for filename in os.listdir(self.directory) + if self.is_compatible(filename) + ] -USAGE: -musort [DIRECTORY] [NAMING_CONVENTION] [OPTIONAL_OPTIONS]... 
+ def is_compatible(self, filename): + """Check if a file is one of the compatible music files based on the extension""" + file_extension = filename.split(".")[-1].lower() + return file_extension in supported_formats - USAGE EXAMPLES: - musort ~/music track.title.year -s _ -r - musort /local/music disc.artist.title.album -r - musort ~/my_music track.title - -OPTIONAL OPTIONS: --h, --help Show the help menu --s, --separator Set the separator for the filename (ex. '-s .' -> 01.track.flac and '-s -' -> 01-track.mp3) - Default separator ( . ) will be used if none is given --r, --recursive Rename files in subdirectories as well --v, --version Prints the version number - -NAMING CONVENTION: -FORMAT_OPTION.FORMAT_OPTION... The amount of format options does not matter. - It can be one, two, three, even all of them. - (See FORMAT OPTIONS below for all options) - -FORMAT OPTIONS: -album album as string -albumartist album artist as string -artist artist name as string -audio_offset number of bytes before audio data begins -bitdepth bit depth for lossless audio -bitrate bitrate in kBits/s -comment file comment as string -composer composer as string -disc disc number -disc_total the total number of discs -duration duration of the song in seconds -filesize file size in bytes -genre genre as string -samplerate samples per second -title title of the song -track track number as string -track_total total number of tracks as string -year year or date as string - -SUPPORTED AUDIO FORMATS: -MP3/MP2/MP1 (ID3 v1, v1.1, v2.2, v2.3+) -Wave/RIFF -OGG -OPUS -FLAC -WMA -MP4/M4A/M4B/M4R/M4V/ALAC/AAX/AAXC""" - -class Music: - def get_files(self, directory): - """Scans the set directory for compatible audio files""" - self.files = list(map(lambda x: os.path.join(os.path.abspath(directory), x),os.listdir(directory))) - - def get_files_recursive(self, directory): - """Scans the set directory with subdirectory for compatible audio files""" - files = [] - for a, b, c in os.walk(directory): - for d 
in c: - files.append(os.path.join(a,d)) - self.files = files - - def get_compatible(self): - music = [] - for file in self.files: - file_extension = file.split(".")[-1] - - if file_extension in supported_formats: - music.append(file) - - self.compatible = music - - def set_separator(self, sep): - """Sets the separator for naming the audio files - (ex. 01-songname.mp3 or 01.songname.flac)""" - if sep in ['\\', '/', '|', '*', '<', '>', '"', '?']: - sep = "_" - logging.warning("Given separator contains invalid filename symbols, defaulting to .") - self.separator = sep - - def set_format(self, val): - """Sets the naming convention of the audio files - (ex. title-artist or artist-track-title)""" - self.format = val.split(".") - - # Rename files def rename_music(self): - """Rename all compatible music files""" + """Rename all provided music""" + for file in self.files: + # Get the file extension + filename, extension = os.path.splitext(file) - """Get the file extension (ex. .flac, .mp3, etc)""" - for file in self.compatible: - ext = file.split(".") - ext = "." 
+ ext[-1] - - """Let TinyTag module read the audio file""" + # Read metadata track = TinyTag.get(file) - """Print the progress (Current track)""" - logging.info(f"Current track: '{track.artist}' - '{track.title}'") + # Show progress + logging.info(f"Renaming track: '{track.artist}' - '{track.title}'.") + + # Use given format to set a new filename rename = [] - - """Uses the given format to set new filename""" - for f in self.format: - - if f == "track": + for metadata_field in self.format: + if metadata_field == "track": rename.append(f"{int(track.track):02}") else: - """getattr gets attribute in track with name f""" - rename.append(getattr(track, f)) - + rename.append(getattr(track, metadata_field)) rename.append(self.separator) rename.pop() - rename = ''.join(rename)+ext - """Replacing forbidden path characters in UNIX and Windows with underscores""" - for forbidden_character in ['\\', '/', '|', '*', '<', '>', '"', '?']: - if forbidden_character in rename: - logging.warning(f"Track contains forbidden path character ({forbidden_character}) in the new file name, replaced symbol with _") - rename = rename.replace(forbidden_character, "_") - """Get the absolute path and rename the audio file""" - dst = os.path.join(os.path.abspath(os.path.dirname(file)), rename) - os.rename(file, dst) - logging.info("Actions finished") + # Replace forbidden characters + rename = ''.join(rename) + for char in invalid_characters: + rename = rename.replace(char, self.replace) -def main(): - level = logging.DEBUG - logging.basicConfig(level=level, format='[%(levelname)s] %(asctime)s - %(message)s') - """Runs the whole program""" - argv = sys.argv[3:] + # Get absolute path and rename the audio file + new_path = os.path.join(os.path.abspath(os.path.dirname(file)), rename + extension) + os.rename(file, new_path) + + logging.info(f"Track: '{track.artist}' - '{track.title}' contained an illegal character; the character has been replaced with: '{self.replace}'.") + logging.info("Renaming 
finished.") + +def parse_args(argv, m_class): + """Parse command line arguments""" try: - opts, args = getopt.getopt(argv, "s:r", ["sep=", "recursive="]) - except getopt.GetoptError as err: - logging.error(err) + opts, args = getopt.getopt(argv[2:], "s:rf:", ["separator=", "recursive", "format="]) + except getopt.GetoptError as error_mesg: + logging.error(error_mesg) exit() - music = Music() + # Handle command line arguments for opt, arg in opts: if opt in ['-s', '--separator']: - logging.info(f"Using {arg} as separator") - music.set_separator(arg) - if opt in ['-r', '--recursive']: - logging.info("Running recursively") - music.get_files_recursive(sys.argv[1]) - music.get_compatible() + m_class.separator = arg if check_separator(arg) else default_separator + elif opt in ['-r', '--recursive']: + m_class.recursive = True + elif opt in ['-f', '--format']: + m_class.format = arg.split(".") - if sys.argv[1] == "-h" or sys.argv[1] == '--help': - print(help) + # Handle help and version options + if '-h' in sys.argv or '--help' in sys.argv: + print(help_text) exit() - if sys.argv[1] == '-v' or sys.argv[1] == '--version': - print(version) + elif '-v' in sys.argv or '--version' in sys.argv: + print(version_text) exit() - try: - music.compatible - except: - logging.info("Running not recursively") - music.get_files(sys.argv[1]) - music.get_compatible() - try: - music.separator - except: - logging.info("Using default separator") - music.set_separator(".") - music.set_format(sys.argv[2]) - music.rename_music() + # Set directory + m_class.directory = sys.argv[1] + + if m_class.directory is None: + logging.error("Please provide a music directory.") + exit() + +def check_separator(sep): + if any(char in separator for char in invalid_characters): + logging.warning(f"Given separator contains invalid filename symbols, defaulting to '{separator}'.\n") + return False + return True + +def main(): + m_class = Musort() + parse_args(sys.argv, m_class) + m_class.get_files() + 
m_class.display_settings() + m_class.rename_music() if __name__ == "__main__": main() diff --git a/src/tinytag.py b/src/tinytag.py new file mode 100644 index 0000000..e9f6f8e --- /dev/null +++ b/src/tinytag.py @@ -0,0 +1,1394 @@ +#!/usr/bin/env python +# -*- coding: utf-8 -*- + +# tinytag - an audio meta info reader +# Copyright (c) 2014-2022 Tom Wallroth +# +# Sources on github: +# http://github.com/devsnd/tinytag/ + +# MIT License + +# Copyright (c) 2014-2022 Tom Wallroth + +# Permission is hereby granted, free of charge, to any person obtaining a copy +# of this software and associated documentation files (the "Software"), to deal +# in the Software without restriction, including without limitation the rights +# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +# copies of the Software, and to permit persons to whom the Software is +# furnished to do so, subject to the following conditions: + +# The above copyright notice and this permission notice shall be included in all +# copies or substantial portions of the Software. + +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +# SOFTWARE. 
+ + +from __future__ import division, print_function +from collections import OrderedDict, defaultdict +try: + from collections.abc import MutableMapping +except ImportError: + from collections import MutableMapping +from functools import reduce +from io import BytesIO +import base64 +import codecs +import io +import json +import operator +import os +import re +import struct +import sys + +DEBUG = os.environ.get('DEBUG', False) # some of the parsers can print debug info + + +class TinyTagException(LookupError): # inherit LookupError for backwards compat + pass + + +def _read(fh, nbytes): # helper function to check if we haven't reached EOF + b = fh.read(nbytes) + if len(b) < nbytes: + raise TinyTagException('Unexpected end of file') + return b + + +def stderr(*args): + sys.stderr.write('%s\n' % ' '.join(repr(arg) for arg in args)) + sys.stderr.flush() + + +def _bytes_to_int_le(b): + fmt = {1: ' 0: + return TinyTag(None, 0) + with io.open(filename, 'rb') as af: + parser_class = cls.get_parser_class(filename, af) + tag = parser_class(af, size, ignore_errors=ignore_errors) + tag._filename = filename + tag._default_encoding = encoding + tag.load(tags=tags, duration=duration, image=image) + tag.extra = dict(tag.extra) # turn default dict into dict so that it can throw KeyError + return tag + + def __str__(self): + return json.dumps(OrderedDict(sorted(self.as_dict().items()))) + + def __repr__(self): + return str(self) + + def load(self, tags, duration, image=False): + self._parse_tags = tags + self._load_image = image + if tags: + self._parse_tag(self._filehandler) + if duration: + if tags: # rewind file if the tags were already parsed + self._filehandler.seek(0) + self._determine_duration(self._filehandler) + + def _set_field(self, fieldname, value, overwrite=True): + """convenience function to set fields of the tinytag by name""" + write_dest = self # write into the TinyTag by default + get_func = getattr + set_func = setattr + is_extra = 
fieldname.startswith('extra.') # but if it's marked as extra field + if is_extra: + fieldname = fieldname[6:] + write_dest = self.extra # write into the extra field instead + get_func = operator.getitem + set_func = operator.setitem + if get_func(write_dest, fieldname): # do not overwrite existing data + return + if DEBUG: + stderr('Setting field "%s" to "%s"' % (fieldname, value)) + if fieldname == 'genre': + genre_id = 255 + if value.isdigit(): # funky: id3v1 genre hidden in a id3v2 field + genre_id = int(value) + else: # funkier: the TCO may contain genres in parens, e.g. '(13)' + if value[:1] == '(' and value[-1:] == ')' and value[1:-1].isdigit(): + genre_id = int(value[1:-1]) + if 0 <= genre_id < len(ID3.ID3V1_GENRES): + value = ID3.ID3V1_GENRES[genre_id] + if fieldname in ("track", "disc", "track_total", "disc_total"): + # Converting to string for type consistency + value = str(value) + mapping = [(fieldname, value)] + if fieldname in ("track", "disc"): + if type(value).__name__ in ('str', 'unicode') and '/' in value: + value, total = value.split('/')[:2] + mapping = [(fieldname, str(value)), ("%s_total" % fieldname, str(total))] + for k, v in mapping: + if overwrite or not get_func(write_dest, k): + set_func(write_dest, k, v) + + def _determine_duration(self, fh): + raise NotImplementedError() + + def _parse_tag(self, fh): + raise NotImplementedError() + + def update(self, other): + # update the values of this tag with the values from another tag + for key in ['track', 'track_total', 'title', 'artist', + 'album', 'albumartist', 'year', 'duration', + 'genre', 'disc', 'disc_total', 'comment', 'composer']: + if not getattr(self, key) and getattr(other, key): + setattr(self, key, getattr(other, key)) + + @staticmethod + def _unpad(s): + # strings in mp3 and asf *may* be terminated with a zero byte at the end + return s.replace('\x00', '') + + +class MP4(TinyTag): + # https://developer.apple.com/library/mac/documentation/QuickTime/QTFF/Metadata/Metadata.html + # 
https://developer.apple.com/library/mac/documentation/QuickTime/QTFF/QTFFChap2/qtff2.html + + class Parser: + # https://developer.apple.com/library/mac/documentation/QuickTime/QTFF/Metadata/Metadata.html#//apple_ref/doc/uid/TP40000939-CH1-SW34 + ATOM_DECODER_BY_TYPE = { + 0: lambda x: x, # 'reserved', + 1: lambda x: codecs.decode(x, 'utf-8', 'replace'), # UTF-8 + 2: lambda x: codecs.decode(x, 'utf-16', 'replace'), # UTF-16 + 3: lambda x: codecs.decode(x, 's/jis', 'replace'), # S/JIS + # 16: duration in millis + 13: lambda x: x, # JPEG + 14: lambda x: x, # PNG + 21: lambda x: struct.unpack('>b', x)[0], # BE Signed int + 22: lambda x: struct.unpack('>B', x)[0], # BE Unsigned int + 23: lambda x: struct.unpack('>f', x)[0], # BE Float32 + 24: lambda x: struct.unpack('>d', x)[0], # BE Float64 + # 27: lambda x: x, # BMP + # 28: lambda x: x, # QuickTime Metadata atom + 65: lambda x: struct.unpack('b', x)[0], # 8-bit Signed int + 66: lambda x: struct.unpack('>h', x)[0], # BE 16-bit Signed int + 67: lambda x: struct.unpack('>i', x)[0], # BE 32-bit Signed int + 74: lambda x: struct.unpack('>q', x)[0], # BE 64-bit Signed int + 75: lambda x: struct.unpack('B', x)[0], # 8-bit Unsigned int + 76: lambda x: struct.unpack('>H', x)[0], # BE 16-bit Unsigned int + 77: lambda x: struct.unpack('>I', x)[0], # BE 32-bit Unsigned int + 78: lambda x: struct.unpack('>Q', x)[0], # BE 64-bit Unsigned int + } + + @classmethod + def make_data_atom_parser(cls, fieldname): + def parse_data_atom(data_atom): + data_type = struct.unpack('>I', data_atom[:4])[0] + conversion = cls.ATOM_DECODER_BY_TYPE.get(data_type) + if conversion is None: + stderr('Cannot convert data type: %s' % data_type) + return {} # don't know how to convert data atom + # skip header & null-bytes, convert rest + return {fieldname: conversion(data_atom[8:])} + return parse_data_atom + + @classmethod + def make_number_parser(cls, fieldname1, fieldname2): + def _(data_atom): + number_data = data_atom[8:14] + numbers = 
struct.unpack('>HHH', number_data) + # for some reason the first number is always irrelevant. + return {fieldname1: numbers[1], fieldname2: numbers[2]} + return _ + + @classmethod + def parse_id3v1_genre(cls, data_atom): + # dunno why the genre is offset by -1 but that's how mutagen does it + idx = struct.unpack('>H', data_atom[8:])[0] - 1 + if idx < len(ID3.ID3V1_GENRES): + return {'genre': ID3.ID3V1_GENRES[idx]} + return {'genre': None} + + @classmethod + def read_extended_descriptor(cls, esds_atom): + for i in range(4): + if esds_atom.read(1) != b'\x80': + break + + @classmethod + def parse_audio_sample_entry_mp4a(cls, data): + # this atom also contains the esds atom: + # https://ffmpeg.org/doxygen/0.6/mov_8c-source.html + # http://xhelmboyx.tripod.com/formats/mp4-layout.txt + # http://sasperger.tistory.com/103 + datafh = BytesIO(data) + datafh.seek(16, os.SEEK_CUR) # jump over version and flags + channels = struct.unpack('>H', datafh.read(2))[0] + datafh.seek(2, os.SEEK_CUR) # jump over bit_depth + datafh.seek(2, os.SEEK_CUR) # jump over QT compr id & pkt size + sr = struct.unpack('>I', datafh.read(4))[0] + + # ES Description Atom + esds_atom_size = struct.unpack('>I', data[28:32])[0] + esds_atom = BytesIO(data[36:36 + esds_atom_size]) + esds_atom.seek(5, os.SEEK_CUR) # jump over version, flags and tag + + # ES Descriptor + cls.read_extended_descriptor(esds_atom) + esds_atom.seek(4, os.SEEK_CUR) # jump over ES id, flags and tag + + # Decoder Config Descriptor + cls.read_extended_descriptor(esds_atom) + esds_atom.seek(9, os.SEEK_CUR) + avg_br = struct.unpack('>I', esds_atom.read(4))[0] / 1000 # kbit/s + return {'channels': channels, 'samplerate': sr, 'bitrate': avg_br} + + @classmethod + def parse_audio_sample_entry_alac(cls, data): + # https://github.com/macosforge/alac/blob/master/ALACMagicCookieDescription.txt + alac_atom_size = struct.unpack('>I', data[28:32])[0] + alac_atom = BytesIO(data[36:36 + alac_atom_size]) + alac_atom.seek(9, os.SEEK_CUR) + bitdepth 
= struct.unpack('b', alac_atom.read(1))[0] + alac_atom.seek(3, os.SEEK_CUR) + channels = struct.unpack('b', alac_atom.read(1))[0] + alac_atom.seek(6, os.SEEK_CUR) + avg_br = struct.unpack('>I', alac_atom.read(4))[0] / 1000 # kbit/s + sr = struct.unpack('>I', alac_atom.read(4))[0] + return {'channels': channels, 'samplerate': sr, 'bitrate': avg_br, 'bitdepth': bitdepth} + + @classmethod + def parse_mvhd(cls, data): + # http://stackoverflow.com/a/3639993/1191373 + walker = BytesIO(data) + version = struct.unpack('b', walker.read(1))[0] + walker.seek(3, os.SEEK_CUR) # jump over flags + if version == 0: # uses 32 bit integers for timestamps + walker.seek(8, os.SEEK_CUR) # jump over create & mod times + time_scale = struct.unpack('>I', walker.read(4))[0] + duration = struct.unpack('>I', walker.read(4))[0] + else: # version == 1: # uses 64 bit integers for timestamps + walker.seek(16, os.SEEK_CUR) # jump over create & mod times + time_scale = struct.unpack('>I', walker.read(4))[0] + duration = struct.unpack('>q', walker.read(8))[0] + return {'duration': duration / time_scale} + + @classmethod + def debug_atom(cls, data): + stderr(data) # use this function to inspect atoms in an atom tree + return {} + + # The parser tree: Each key is an atom name which is traversed if existing. + # Leaves of the parser tree are callables which receive the atom data. + # callables return {fieldname: value} which is updates the TinyTag. 
+ META_DATA_TREE = {b'moov': {b'udta': {b'meta': {b'ilst': { + # see: http://atomicparsley.sourceforge.net/mpeg-4files.html + # and: https://metacpan.org/dist/Image-ExifTool/source/lib/Image/ExifTool/QuickTime.pm#L3093 + b'\xa9ART': {b'data': Parser.make_data_atom_parser('artist')}, + b'\xa9alb': {b'data': Parser.make_data_atom_parser('album')}, + b'\xa9cmt': {b'data': Parser.make_data_atom_parser('comment')}, + # need test-data for this + # b'cpil': {b'data': Parser.make_data_atom_parser('extra.compilation')}, + b'\xa9day': {b'data': Parser.make_data_atom_parser('year')}, + b'\xa9des': {b'data': Parser.make_data_atom_parser('extra.description')}, + b'\xa9gen': {b'data': Parser.make_data_atom_parser('genre')}, + b'\xa9lyr': {b'data': Parser.make_data_atom_parser('extra.lyrics')}, + b'\xa9mvn': {b'data': Parser.make_data_atom_parser('movement')}, + b'\xa9nam': {b'data': Parser.make_data_atom_parser('title')}, + b'\xa9wrt': {b'data': Parser.make_data_atom_parser('composer')}, + b'aART': {b'data': Parser.make_data_atom_parser('albumartist')}, + b'cprt': {b'data': Parser.make_data_atom_parser('extra.copyright')}, + b'desc': {b'data': Parser.make_data_atom_parser('extra.description')}, + b'disk': {b'data': Parser.make_number_parser('disc', 'disc_total')}, + b'gnre': {b'data': Parser.parse_id3v1_genre}, + b'trkn': {b'data': Parser.make_number_parser('track', 'track_total')}, + # need test-data for this + # b'tmpo': {b'data': Parser.make_data_atom_parser('extra.bmp')}, + }}}}} + + # see: https://developer.apple.com/library/mac/documentation/QuickTime/QTFF/QTFFChap3/qtff3.html + AUDIO_DATA_TREE = { + b'moov': { + b'mvhd': Parser.parse_mvhd, + b'trak': {b'mdia': {b"minf": {b"stbl": {b"stsd": { + b'mp4a': Parser.parse_audio_sample_entry_mp4a, + b'alac': Parser.parse_audio_sample_entry_alac + }}}}} + } + } + + IMAGE_DATA_TREE = {b'moov': {b'udta': {b'meta': {b'ilst': { + b'covr': {b'data': Parser.make_data_atom_parser('_image_data')}, + }}}}} + + VERSIONED_ATOMS = {b'meta', 
b'stsd'} # those have an extra 4 byte header + FLAGGED_ATOMS = {b'stsd'} # these also have an extra 4 byte header + + def _determine_duration(self, fh): + self._traverse_atoms(fh, path=self.AUDIO_DATA_TREE) + + def _parse_tag(self, fh): + self._traverse_atoms(fh, path=self.META_DATA_TREE) + if self._load_image: # A bit inefficient, we rewind the file + self._filehandler.seek(0) # to parse it again for the image + self._traverse_atoms(fh, path=self.IMAGE_DATA_TREE) + + def _traverse_atoms(self, fh, path, stop_pos=None, curr_path=None): + header_size = 8 + atom_header = fh.read(header_size) + while len(atom_header) == header_size: + atom_size = struct.unpack('>I', atom_header[:4])[0] - header_size + atom_type = atom_header[4:] + if curr_path is None: # keep track how we traversed in the tree + curr_path = [atom_type] + if atom_size <= 0: # empty atom, jump to next one + atom_header = fh.read(header_size) + continue + if DEBUG: + stderr('%s pos: %d atom: %s len: %d' % + (' ' * 4 * len(curr_path), fh.tell() - header_size, atom_type, + atom_size + header_size)) + if atom_type in self.VERSIONED_ATOMS: # jump atom version for now + fh.seek(4, os.SEEK_CUR) + if atom_type in self.FLAGGED_ATOMS: # jump atom flags for now + fh.seek(4, os.SEEK_CUR) + sub_path = path.get(atom_type, None) + # if the path leaf is a dict, traverse deeper into the tree: + if issubclass(type(sub_path), MutableMapping): + atom_end_pos = fh.tell() + atom_size + self._traverse_atoms(fh, path=sub_path, stop_pos=atom_end_pos, + curr_path=curr_path + [atom_type]) + # if the path-leaf is a callable, call it on the atom data + elif callable(sub_path): + for fieldname, value in sub_path(fh.read(atom_size)).items(): + if DEBUG: + stderr(' ' * 4 * len(curr_path), 'FIELD: ', fieldname) + if fieldname: + self._set_field(fieldname, value) + # if no action was specified using dict or callable, jump over atom + else: + fh.seek(atom_size, os.SEEK_CUR) + # check if we have reached the end of this branch: + if 
stop_pos and fh.tell() >= stop_pos: + return # return to parent (next parent node in tree) + atom_header = fh.read(header_size) # read next atom + + +class ID3(TinyTag): + FRAME_ID_TO_FIELD = { # Mapping from Frame ID to a field of the TinyTag + 'COMM': 'comment', 'COM': 'comment', + 'TRCK': 'track', 'TRK': 'track', + 'TYER': 'year', 'TYE': 'year', 'TDRC': 'year', + 'TALB': 'album', 'TAL': 'album', + 'TPE1': 'artist', 'TP1': 'artist', + 'TIT2': 'title', 'TT2': 'title', + 'TCON': 'genre', 'TCO': 'genre', + 'TPOS': 'disc', + 'TPE2': 'albumartist', 'TCOM': 'composer', + 'WXXX': 'extra.url', + 'TSRC': 'extra.isrc', + 'TXXX': 'extra.text', + 'TKEY': 'extra.initial_key', + 'USLT': 'extra.lyrics', + } + IMAGE_FRAME_IDS = {'APIC', 'PIC'} + PARSABLE_FRAME_IDS = set(FRAME_ID_TO_FIELD.keys()).union(IMAGE_FRAME_IDS) + _MAX_ESTIMATION_SEC = 30 + _CBR_DETECTION_FRAME_COUNT = 5 + _USE_XING_HEADER = True # much faster, but can be deactivated for testing + + ID3V1_GENRES = [ + 'Blues', 'Classic Rock', 'Country', 'Dance', 'Disco', + 'Funk', 'Grunge', 'Hip-Hop', 'Jazz', 'Metal', 'New Age', 'Oldies', + 'Other', 'Pop', 'R&B', 'Rap', 'Reggae', 'Rock', 'Techno', 'Industrial', + 'Alternative', 'Ska', 'Death Metal', 'Pranks', 'Soundtrack', + 'Euro-Techno', 'Ambient', 'Trip-Hop', 'Vocal', 'Jazz+Funk', 'Fusion', + 'Trance', 'Classical', 'Instrumental', 'Acid', 'House', 'Game', + 'Sound Clip', 'Gospel', 'Noise', 'AlternRock', 'Bass', 'Soul', 'Punk', + 'Space', 'Meditative', 'Instrumental Pop', 'Instrumental Rock', + 'Ethnic', 'Gothic', 'Darkwave', 'Techno-Industrial', 'Electronic', + 'Pop-Folk', 'Eurodance', 'Dream', 'Southern Rock', 'Comedy', 'Cult', + 'Gangsta', 'Top 40', 'Christian Rap', 'Pop/Funk', 'Jungle', + 'Native American', 'Cabaret', 'New Wave', 'Psychadelic', 'Rave', + 'Showtunes', 'Trailer', 'Lo-Fi', 'Tribal', 'Acid Punk', 'Acid Jazz', + 'Polka', 'Retro', 'Musical', 'Rock & Roll', 'Hard Rock', + + # Wimamp Extended Genres + 'Folk', 'Folk-Rock', 'National Folk', 'Swing', 'Fast 
Fusion', 'Bebob', + 'Latin', 'Revival', 'Celtic', 'Bluegrass', 'Avantgarde', 'Gothic Rock', + 'Progressive Rock', 'Psychedelic Rock', 'Symphonic Rock', 'Slow Rock', + 'Big Band', 'Chorus', 'Easy Listening', 'Acoustic', 'Humour', 'Speech', + 'Chanson', 'Opera', 'Chamber Music', 'Sonata', 'Symphony', 'Booty Bass', + 'Primus', 'Porn Groove', 'Satire', 'Slow Jam', 'Club', 'Tango', 'Samba', + 'Folklore', 'Ballad', 'Power Ballad', 'Rhythmic Soul', 'Freestyle', + 'Duet', 'Punk Rock', 'Drum Solo', 'A capella', 'Euro-House', + 'Dance Hall', 'Goa', 'Drum & Bass', + + # according to https://de.wikipedia.org/wiki/Liste_der_ID3v1-Genres: + 'Club-House', 'Hardcore Techno', 'Terror', 'Indie', 'BritPop', + '', # don't use ethnic slur ("Negerpunk", WTF!) + 'Polsk Punk', 'Beat', 'Christian Gangsta Rap', 'Heavy Metal', + 'Black Metal', 'Contemporary Christian', 'Christian Rock', + # WinAmp 1.91 + 'Merengue', 'Salsa', 'Thrash Metal', 'Anime', 'Jpop', 'Synthpop', + # WinAmp 5.6 + 'Abstract', 'Art Rock', 'Baroque', 'Bhangra', 'Big Beat', 'Breakbeat', + 'Chillout', 'Downtempo', 'Dub', 'EBM', 'Eclectic', 'Electro', + 'Electroclash', 'Emo', 'Experimental', 'Garage', 'Illbient', + 'Industro-Goth', 'Jam Band', 'Krautrock', 'Leftfield', 'Lounge', + 'Math Rock', 'New Romantic', 'Nu-Breakz', 'Post-Punk', 'Post-Rock', + 'Psytrance', 'Shoegaze', 'Space Rock', 'Trop Rock', 'World Music', + 'Neoclassical', 'Audiobook', 'Audio Theatre', 'Neue Deutsche Welle', + 'Podcast', 'Indie Rock', 'G-Funk', 'Dubstep', 'Garage Rock', 'Psybient', + ] + + def __init__(self, filehandler, filesize, *args, **kwargs): + TinyTag.__init__(self, filehandler, filesize, *args, **kwargs) + # save position after the ID3 tag for duration measurement speedup + self._bytepos_after_id3v2 = None + + @classmethod + def set_estimation_precision(cls, estimation_in_seconds): + cls._MAX_ESTIMATION_SEC = estimation_in_seconds + + # see this page for the magic values used in mp3: + # 
http://www.mpgedit.org/mpgedit/mpeg_format/mpeghdr.htm + samplerates = [ + [11025, 12000, 8000], # MPEG 2.5 + [], # reserved + [22050, 24000, 16000], # MPEG 2 + [44100, 48000, 32000], # MPEG 1 + ] + v1l1 = [0, 32, 64, 96, 128, 160, 192, 224, 256, 288, 320, 352, 384, 416, 448, 0] + v1l2 = [0, 32, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, 384, 0] + v1l3 = [0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, 0] + v2l1 = [0, 32, 48, 56, 64, 80, 96, 112, 128, 144, 160, 176, 192, 224, 256, 0] + v2l2 = [0, 8, 16, 24, 32, 40, 48, 56, 64, 80, 96, 112, 128, 144, 160, 0] + v2l3 = v2l2 + bitrate_by_version_by_layer = [ + [None, v2l3, v2l2, v2l1], # MPEG Version 2.5 # note that the layers go + None, # reserved # from 3 to 1 by design. + [None, v2l3, v2l2, v2l1], # MPEG Version 2 # the first layer id is + [None, v1l3, v1l2, v1l1], # MPEG Version 1 # reserved + ] + samples_per_frame = 1152 # the default frame size for mp3 + channels_per_channel_mode = [ + 2, # 00 Stereo + 2, # 01 Joint stereo (Stereo) + 2, # 10 Dual channel (2 mono channels) + 1, # 11 Single channel (Mono) + ] + + @staticmethod + def _parse_xing_header(fh): + # see: http://www.mp3-tech.org/programmer/sources/vbrheadersdk.zip + fh.seek(4, os.SEEK_CUR) # read over Xing header + header_flags = struct.unpack('>i', fh.read(4))[0] + frames = byte_count = toc = vbr_scale = None + if header_flags & 1: # FRAMES FLAG + frames = struct.unpack('>i', fh.read(4))[0] + if header_flags & 2: # BYTES FLAG + byte_count = struct.unpack('>i', fh.read(4))[0] + if header_flags & 4: # TOC FLAG + toc = [struct.unpack('>i', fh.read(4))[0] for _ in range(25)] # 100 bytes + if header_flags & 8: # VBR SCALE FLAG + vbr_scale = struct.unpack('>i', fh.read(4))[0] + return frames, byte_count, toc, vbr_scale + + def _determine_duration(self, fh): + # if tag reading was disabled, find start position of audio data + if self._bytepos_after_id3v2 is None: + self._parse_id3v2_header(fh) + + max_estimation_frames = 
(ID3._MAX_ESTIMATION_SEC * 44100) // ID3.samples_per_frame + frame_size_accu = 0 + header_bytes = 4 + frames = 0 # count frames for determining mp3 duration + bitrate_accu = 0 # add up bitrates to find average bitrate to detect + last_bitrates = [] # CBR mp3s (multiple frames with same bitrates) + # seek to first position after id3 tag (speedup for large header) + fh.seek(self._bytepos_after_id3v2) + while True: + # reading through garbage until 11 '1' sync-bits are found + b = fh.peek(4) + if len(b) < 4: + if frames: + self.bitrate = bitrate_accu / frames + break # EOF + sync, conf, bitrate_freq, rest = struct.unpack('BBBB', b[0:4]) + br_id = (bitrate_freq >> 4) & 0x0F # biterate id + sr_id = (bitrate_freq >> 2) & 0x03 # sample rate id + padding = 1 if bitrate_freq & 0x02 > 0 else 0 + mpeg_id = (conf >> 3) & 0x03 + layer_id = (conf >> 1) & 0x03 + channel_mode = (rest >> 6) & 0x03 + # check for eleven 1s, validate bitrate and sample rate + if (not b[:2] > b'\xFF\xE0' or br_id > 14 or br_id == 0 or sr_id == 3 + or layer_id == 0 or mpeg_id == 1): # noqa + idx = b.find(b'\xFF', 1) # invalid frame, find next sync header + if idx == -1: + idx = len(b) # not found: jump over the current peek buffer + fh.seek(max(idx, 1), os.SEEK_CUR) + continue + try: + self.channels = self.channels_per_channel_mode[channel_mode] + frame_bitrate = ID3.bitrate_by_version_by_layer[mpeg_id][layer_id][br_id] + self.samplerate = ID3.samplerates[mpeg_id][sr_id] + except (IndexError, TypeError): + raise TinyTagException('mp3 parsing failed') + # There might be a xing header in the first frame that contains + # all the info we need, otherwise parse multiple frames to find the + # accurate average bitrate + if frames == 0 and ID3._USE_XING_HEADER: + xing_header_offset = b.find(b'Xing') + if xing_header_offset != -1: + fh.seek(xing_header_offset, os.SEEK_CUR) + xframes, byte_count, toc, vbr_scale = ID3._parse_xing_header(fh) + if xframes and xframes != 0 and byte_count: + # MPEG-2 Audio Layer III 
uses 576 samples per frame + samples_per_frame = 576 if mpeg_id <= 2 else ID3.samples_per_frame + self.duration = xframes * samples_per_frame / float(self.samplerate) + # self.duration = (xframes * ID3.samples_per_frame / self.samplerate + # / self.channels) # noqa + self.bitrate = byte_count * 8 / self.duration / 1000 + self.audio_offset = fh.tell() + return + continue + + frames += 1 # it's most probably an mp3 frame + bitrate_accu += frame_bitrate + if frames == 1: + self.audio_offset = fh.tell() + if frames <= ID3._CBR_DETECTION_FRAME_COUNT: + last_bitrates.append(frame_bitrate) + fh.seek(4, os.SEEK_CUR) # jump over peeked bytes + + frame_length = (144000 * frame_bitrate) // self.samplerate + padding + frame_size_accu += frame_length + # if bitrate does not change over time its probably CBR + is_cbr = (frames == ID3._CBR_DETECTION_FRAME_COUNT and len(set(last_bitrates)) == 1) + if frames == max_estimation_frames or is_cbr: + # try to estimate duration + fh.seek(-128, 2) # jump to last byte (leaving out id3v1 tag) + audio_stream_size = fh.tell() - self.audio_offset + est_frame_count = audio_stream_size / (frame_size_accu / frames) + samples = est_frame_count * ID3.samples_per_frame + self.duration = samples / self.samplerate + self.bitrate = bitrate_accu / frames + return + + if frame_length > 1: # jump over current frame body + fh.seek(frame_length - header_bytes, os.SEEK_CUR) + if self.samplerate: + self.duration = frames * ID3.samples_per_frame / self.samplerate + + def _parse_tag(self, fh): + self._parse_id3v2(fh) + attrs = ['track', 'track_total', 'title', 'artist', 'album', 'albumartist', 'year', 'genre'] + has_all_tags = all(getattr(self, attr) for attr in attrs) + if not has_all_tags and self.filesize > 128: + fh.seek(-128, os.SEEK_END) # try parsing id3v1 in last 128 bytes + self._parse_id3v1(fh) + + def _parse_id3v2_header(self, fh): + size, extended, major = 0, None, None + # for info on the specs, see: http://id3.org/Developer%20Information + header 
= struct.unpack('3sBBB4B', _read(fh, 10)) + tag = codecs.decode(header[0], 'ISO-8859-1') + # check if there is an ID3v2 tag at the beginning of the file + if tag == 'ID3': + major, rev = header[1:3] + if DEBUG: + stderr('Found id3 v2.%s' % major) + # unsync = (header[3] & 0x80) > 0 + extended = (header[3] & 0x40) > 0 + # experimental = (header[3] & 0x20) > 0 + # footer = (header[3] & 0x10) > 0 + size = self._calc_size(header[4:8], 7) + self._bytepos_after_id3v2 = size + return size, extended, major + + def _parse_id3v2(self, fh): + size, extended, major = self._parse_id3v2_header(fh) + if size: + end_pos = fh.tell() + size + parsed_size = 0 + if extended: # just read over the extended header. + size_bytes = struct.unpack('4B', _read(fh, 6)[0:4]) + extd_size = self._calc_size(size_bytes, 7) + fh.seek(extd_size - 6, os.SEEK_CUR) # jump over extended_header + while parsed_size < size: + frame_size = self._parse_frame(fh, id3version=major) + if frame_size == 0: + break + parsed_size += frame_size + fh.seek(end_pos, os.SEEK_SET) + + def _parse_id3v1(self, fh): + if fh.read(3) == b'TAG': # check if this is an ID3 v1 tag + def asciidecode(x): + return self._unpad(codecs.decode(x, self._default_encoding or 'latin1')) + fields = fh.read(30 + 30 + 30 + 4 + 30 + 1) + self._set_field('title', asciidecode(fields[:30]), overwrite=False) + self._set_field('artist', asciidecode(fields[30:60]), overwrite=False) + self._set_field('album', asciidecode(fields[60:90]), overwrite=False) + self._set_field('year', asciidecode(fields[90:94]), overwrite=False) + comment = fields[94:124] + if b'\x00\x00' < comment[-2:] < b'\x01\x00': + self._set_field('track', str(ord(comment[-1:])), overwrite=False) + comment = comment[:-2] + self._set_field('comment', asciidecode(comment), overwrite=False) + genre_id = ord(fields[124:125]) + if genre_id < len(ID3.ID3V1_GENRES): + self._set_field('genre', ID3.ID3V1_GENRES[genre_id], overwrite=False) + + @staticmethod + def index_utf16(s, search): + for i in 
range(0, len(s), len(search)): + if s[i:i + len(search)] == search: + return i + return -1 + + def _parse_frame(self, fh, id3version=False): + # ID3v2.2 especially ugly. see: http://id3.org/id3v2-00 + frame_header_size = 6 if id3version == 2 else 10 + frame_size_bytes = 3 if id3version == 2 else 4 + binformat = '3s3B' if id3version == 2 else '4s4B2B' + bits_per_byte = 7 if id3version == 4 else 8 # only id3v2.4 is synchsafe + frame_header_data = fh.read(frame_header_size) + if len(frame_header_data) != frame_header_size: + return 0 + frame = struct.unpack(binformat, frame_header_data) + frame_id = self._decode_string(frame[0]) + frame_size = self._calc_size(frame[1:1 + frame_size_bytes], bits_per_byte) + if DEBUG: + stderr('Found id3 Frame %s at %d-%d of %d' % + (frame_id, fh.tell(), fh.tell() + frame_size, self.filesize)) + if frame_size > 0: + # flags = frame[1+frame_size_bytes:] # dont care about flags. + if frame_id not in ID3.PARSABLE_FRAME_IDS: # jump over unparsable frames + fh.seek(frame_size, os.SEEK_CUR) + return frame_size + content = fh.read(frame_size) + fieldname = ID3.FRAME_ID_TO_FIELD.get(frame_id) + if fieldname: + language = fieldname in ("comment", "extra.lyrics") + self._set_field(fieldname, self._decode_string(content, language)) + elif frame_id in self.IMAGE_FRAME_IDS and self._load_image: + # See section 4.14: http://id3.org/id3v2.4.0-frames + encoding = content[0:1] + if frame_id == 'PIC': # ID3 v2.2: + desc_start_pos = 1 + 3 + 1 # skip encoding (1), imgformat (3), pictype(1) + else: # ID3 v2.3+ + desc_start_pos = content.index(b'\x00', 1) + 1 + 1 # skip mimetype, pictype(1) + # latin1 and utf-8 are 1 byte + termination = b'\x00' if encoding in (b'\x00', b'\x03') else b'\x00\x00' + desc_length = ID3.index_utf16(content[desc_start_pos:], termination) + desc_end_pos = desc_start_pos + desc_length + len(termination) + self._image_data = content[desc_end_pos:] + return frame_size + return 0 + + def _decode_string(self, bytestr, language=False): + 
default_encoding = 'ISO-8859-1' + if self._default_encoding: + default_encoding = self._default_encoding + try: # it's not my fault, this is the spec. + first_byte = bytestr[:1] + if first_byte == b'\x00': # ISO-8859-1 + bytestr = bytestr[1:] + encoding = default_encoding + elif first_byte == b'\x01': # UTF-16 with BOM + bytestr = bytestr[1:] + # remove language (but leave BOM) + if language: + if bytestr[3:5] in (b'\xfe\xff', b'\xff\xfe'): + bytestr = bytestr[3:] + if bytestr[:3].isalpha() and bytestr[3:4] == b'\x00': + bytestr = bytestr[4:] # remove language + if bytestr[:1] == b'\x00': + bytestr = bytestr[1:] # strip optional additional null byte + # read byte order mark to determine endianness + encoding = 'UTF-16be' if bytestr[0:2] == b'\xfe\xff' else 'UTF-16le' + # strip the bom if it exists + if bytestr[:2] in (b'\xfe\xff', b'\xff\xfe'): + bytestr = bytestr[2:] if len(bytestr) % 2 == 0 else bytestr[2:-1] + # remove ADDITIONAL EXTRA BOM :facepalm: + if bytestr[:4] == b'\x00\x00\xff\xfe': + bytestr = bytestr[4:] + elif first_byte == b'\x02': # UTF-16LE + # strip optional null byte, if byte count uneven + bytestr = bytestr[1:-1] if len(bytestr) % 2 == 0 else bytestr[1:] + encoding = 'UTF-16le' + elif first_byte == b'\x03': # UTF-8 + bytestr = bytestr[1:] + encoding = 'UTF-8' + else: + bytestr = bytestr + encoding = default_encoding # wild guess + if language and bytestr[:3].isalpha() and bytestr[3:4] == b'\x00': + bytestr = bytestr[4:] # remove language + errors = 'ignore' if self._ignore_errors else 'strict' + return self._unpad(codecs.decode(bytestr, encoding, errors)) + except UnicodeDecodeError: + raise TinyTagException('Error decoding ID3 Tag!') + + def _calc_size(self, bytestr, bits_per_byte): + # length of some mp3 header fields is described by 7 or 8-bit-bytes + return reduce(lambda accu, elem: (accu << bits_per_byte) + elem, bytestr, 0) + + +class Ogg(TinyTag): + def __init__(self, filehandler, filesize, *args, **kwargs): + TinyTag.__init__(self, 
filehandler, filesize, *args, **kwargs) + self._tags_parsed = False + self._max_samplenum = 0 # maximum sample position ever read + + def _determine_duration(self, fh): + max_page_size = 65536 # https://xiph.org/ogg/doc/libogg/ogg_page.html + if not self._tags_parsed: + self._parse_tag(fh) # determine sample rate + fh.seek(0) # and rewind to start + if self.filesize > max_page_size: + fh.seek(-max_page_size, 2) # go to last possible page position + while True: + b = fh.peek(4) + if len(b) == 0: + return # EOF + if b[:4] == b'OggS': # look for an ogg header + for _ in self._parse_pages(fh): + pass # parse all remaining pages + self.duration = self._max_samplenum / self.samplerate + else: + idx = b.find(b'OggS') # try to find header in peeked data + seekpos = idx if idx != -1 else len(b) - 3 + fh.seek(max(seekpos, 1), os.SEEK_CUR) + + def _parse_tag(self, fh): + page_start_pos = fh.tell() # set audio_offset later if its audio data + for packet in self._parse_pages(fh): + walker = BytesIO(packet) + if packet[0:7] == b"\x01vorbis": + (channels, self.samplerate, max_bitrate, bitrate, + min_bitrate) = struct.unpack(" 0: + fh.seek(remaining_size, 1) # skip remaining data in chunk + elif subchunkid == b'data': + self.duration = subchunksize / self.channels / self.samplerate / (self.bitdepth / 8) + self.audio_offset = fh.tell() - 8 # rewind to data header + fh.seek(subchunksize, 1) + elif subchunkid == b'LIST' and self._parse_tags: + is_info = fh.read(4) # check INFO header + if is_info != b'INFO': # jump over non-INFO sections + fh.seek(subchunksize - 4, os.SEEK_CUR) + else: + sub_fh = BytesIO(fh.read(subchunksize - 4)) + field = sub_fh.read(4) + while len(field) == 4: + data_length = struct.unpack('I', sub_fh.read(4))[0] + data_length += data_length % 2 # IFF chunks are padded to an even size + data = sub_fh.read(data_length).split(b'\x00', 1)[0] # strip zero-byte + fieldname = self.riff_mapping.get(field) + if fieldname: + self._set_field(fieldname, codecs.decode(data, 
'utf-8')) + field = sub_fh.read(4) + elif subchunkid in (b'id3 ', b'ID3 ') and self._parse_tags: + id3 = ID3(fh, 0) + id3._parse_id3v2(fh) + self.update(id3) + else: # some other chunk, just skip the data + fh.seek(subchunksize, 1) + chunk_header = fh.read(8) + self._duration_parsed = True + + def _parse_tag(self, fh): + if not self._duration_parsed: + self._determine_duration(fh) # parse whole file to determine tags:( + + +class Flac(TinyTag): + METADATA_STREAMINFO = 0 + METADATA_PADDING = 1 + METADATA_APPLICATION = 2 + METADATA_SEEKTABLE = 3 + METADATA_VORBIS_COMMENT = 4 + METADATA_CUESHEET = 5 + METADATA_PICTURE = 6 + + def load(self, tags, duration, image=False): + self._parse_tags = tags + self._load_image = image + header = self._filehandler.peek(4) + if header[:3] == b'ID3': # parse ID3 header if it exists + id3 = ID3(self._filehandler, 0) + id3._parse_id3v2(self._filehandler) + self.update(id3) + header = self._filehandler.peek(4) # after ID3 should be fLaC + if header[:4] != b'fLaC': + raise TinyTagException('Invalid flac header') + self._filehandler.seek(4, os.SEEK_CUR) + self._determine_duration(self._filehandler) + + def _determine_duration(self, fh): + # for spec, see https://xiph.org/flac/ogg_mapping.html + header_data = fh.read(4) + while len(header_data): + meta_header = struct.unpack('B3B', header_data) + block_type = meta_header[0] & 0x7f + is_last_block = meta_header[0] & 0x80 + size = _bytes_to_int(meta_header[1:4]) + # http://xiph.org/flac/format.html#metadata_block_streaminfo + if block_type == Flac.METADATA_STREAMINFO: + stream_info_header = fh.read(size) + if len(stream_info_header) < 34: # invalid streaminfo + return + header = struct.unpack('HH3s3s8B16s', stream_info_header) + # From the xiph documentation: + # py | + # ---------------------------------------------- + # H | <16> The minimum block size (in samples) + # H | <16> The maximum block size (in samples) + # 3s | <24> The minimum frame size (in bytes) + # 3s | <24> The maximum 
frame size (in bytes) + # 8B | <20> Sample rate in Hz. + # | <3> (number of channels)-1. + # | <5> (bits per sample)-1. + # | <36> Total samples in stream. + # 16s| <128> MD5 signature + # min_blk, max_blk, min_frm, max_frm = header[0:4] + # min_frm = _bytes_to_int(struct.unpack('3B', min_frm)) + # max_frm = _bytes_to_int(struct.unpack('3B', max_frm)) + # channels--. bits total samples + # |----- samplerate -----| |-||----| |---------~ ~----| + # 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 + # #---4---# #---5---# #---6---# #---7---# #--8-~ ~-12-# + self.samplerate = _bytes_to_int(header[4:7]) >> 4 + self.channels = ((header[6] >> 1) & 0x07) + 1 + self.bitdepth = (((header[6] & 1) << 4) + ((header[7] & 0xF0) >> 4) + 1) + total_sample_bytes = [(header[7] & 0x0F)] + list(header[8:12]) + total_samples = _bytes_to_int(total_sample_bytes) + self.duration = total_samples / self.samplerate + if self.duration > 0: + self.bitrate = self.filesize / self.duration * 8 / 1000 + elif block_type == Flac.METADATA_VORBIS_COMMENT and self._parse_tags: + oggtag = Ogg(fh, 0) + oggtag._parse_vorbis_comment(fh) + self.update(oggtag) + elif block_type == Flac.METADATA_PICTURE and self._load_image: + self._image_data = self._parse_image(fh) + elif block_type >= 127: + return # invalid block type + else: + if DEBUG: + stderr('Unknown FLAC block type', block_type) + fh.seek(size, 1) # seek over this block + + if is_last_block: + return + header_data = fh.read(4) + + @staticmethod + def _parse_image(fh): + # https://xiph.org/flac/format.html#metadata_block_picture + pic_type, mime_len = struct.unpack('>2I', fh.read(8)) + fh.read(mime_len) + description_len = struct.unpack('>I', fh.read(4))[0] + fh.read(description_len) + width, height, depth, colors, pic_len = struct.unpack('>5I', fh.read(20)) + return fh.read(pic_len) + + +class Wma(TinyTag): + ASF_CONTENT_DESCRIPTION_OBJECT = b'3&\xb2u\x8ef\xcf\x11\xa6\xd9\x00\xaa\x00b\xcel' + ASF_EXTENDED_CONTENT_DESCRIPTION_OBJECT = 
(b'@\xa4\xd0\xd2\x07\xe3\xd2\x11\x97\xf0\x00' + b'\xa0\xc9^\xa8P') + STREAM_BITRATE_PROPERTIES_OBJECT = b'\xceu\xf8{\x8dF\xd1\x11\x8d\x82\x00`\x97\xc9\xa2\xb2' + ASF_FILE_PROPERTY_OBJECT = b'\xa1\xdc\xab\x8cG\xa9\xcf\x11\x8e\xe4\x00\xc0\x0c Se' + ASF_STREAM_PROPERTIES_OBJECT = b'\x91\x07\xdc\xb7\xb7\xa9\xcf\x11\x8e\xe6\x00\xc0\x0c Se' + STREAM_TYPE_ASF_AUDIO_MEDIA = b'@\x9ei\xf8M[\xcf\x11\xa8\xfd\x00\x80_\\D+' + # see: + # http://web.archive.org/web/20131203084402/http://msdn.microsoft.com/en-us/library/bb643323.aspx + # and (japanese, but none the less helpful) + # http://uguisu.skr.jp/Windows/format_asf.html + + def __init__(self, filehandler, filesize, *args, **kwargs): + TinyTag.__init__(self, filehandler, filesize, *args, **kwargs) + self.__tag_parsed = False + + def _determine_duration(self, fh): + if not self.__tag_parsed: + self._parse_tag(fh) + + def read_blocks(self, fh, blocks): + # blocks are a list(tuple('fieldname', byte_count, cast_int), ...) + decoded = {} + for block in blocks: + val = fh.read(block[1]) + if block[2]: + val = _bytes_to_int_le(val) + decoded[block[0]] = val + return decoded + + def __bytes_to_guid(self, obj_id_bytes): + return '-'.join([ + hex(_bytes_to_int_le(obj_id_bytes[:-12]))[2:].zfill(6), + hex(_bytes_to_int_le(obj_id_bytes[-12:-10]))[2:].zfill(4), + hex(_bytes_to_int_le(obj_id_bytes[-10:-8]))[2:].zfill(4), + hex(_bytes_to_int(obj_id_bytes[-8:-6]))[2:].zfill(4), + hex(_bytes_to_int(obj_id_bytes[-6:]))[2:].zfill(12), + ]) + + def __decode_string(self, bytestring): + return self._unpad(codecs.decode(bytestring, 'utf-16')) + + def __decode_ext_desc(self, value_type, value): + """ decode ASF_EXTENDED_CONTENT_DESCRIPTION_OBJECT values""" + if value_type == 0: # Unicode string + return self.__decode_string(value) + elif value_type == 1: # BYTE array + return value + elif 1 < value_type < 6: # DWORD / QWORD / WORD + return _bytes_to_int_le(value) + + def _parse_tag(self, fh): + self.__tag_parsed = True + guid = fh.read(16) # 128 bit 
GUID + if guid != b'0&\xb2u\x8ef\xcf\x11\xa6\xd9\x00\xaa\x00b\xcel': + # not a valid ASF container! see: http://www.garykessler.net/library/file_sigs.html + return + struct.unpack('Q', fh.read(8))[0] # size + struct.unpack('I', fh.read(4))[0] # obj_count + if fh.read(2) != b'\x01\x02': + # http://web.archive.org/web/20131203084402/http://msdn.microsoft.com/en-us/library/bb643323.aspx#_Toc521913958 + return # not a valid asf header! + while True: + object_id = fh.read(16) + object_size = _bytes_to_int_le(fh.read(8)) + if object_size == 0 or object_size > self.filesize: + break # invalid object, stop parsing. + if object_id == Wma.ASF_CONTENT_DESCRIPTION_OBJECT and self._parse_tags: + len_blocks = self.read_blocks(fh, [ + ('title_length', 2, True), + ('author_length', 2, True), + ('copyright_length', 2, True), + ('description_length', 2, True), + ('rating_length', 2, True), + ]) + data_blocks = self.read_blocks(fh, [ + ('title', len_blocks['title_length'], False), + ('artist', len_blocks['author_length'], False), + ('', len_blocks['copyright_length'], True), + ('comment', len_blocks['description_length'], False), + ('', len_blocks['rating_length'], True), + ]) + for field_name, bytestring in data_blocks.items(): + if field_name: + self._set_field(field_name, self.__decode_string(bytestring)) + elif object_id == Wma.ASF_EXTENDED_CONTENT_DESCRIPTION_OBJECT and self._parse_tags: + mapping = { + 'WM/TrackNumber': 'track', + 'WM/PartOfSet': 'disc', + 'WM/Year': 'year', + 'WM/AlbumArtist': 'albumartist', + 'WM/Genre': 'genre', + 'WM/AlbumTitle': 'album', + 'WM/Composer': 'composer', + } + # http://web.archive.org/web/20131203084402/http://msdn.microsoft.com/en-us/library/bb643323.aspx#_Toc509555195 + descriptor_count = _bytes_to_int_le(fh.read(2)) + for _ in range(descriptor_count): + name_len = _bytes_to_int_le(fh.read(2)) + name = self.__decode_string(fh.read(name_len)) + value_type = _bytes_to_int_le(fh.read(2)) + value_len = _bytes_to_int_le(fh.read(2)) + value = 
fh.read(value_len) + field_name = mapping.get(name) + if field_name: + field_value = self.__decode_ext_desc(value_type, value) + self._set_field(field_name, field_value) + elif object_id == Wma.ASF_FILE_PROPERTY_OBJECT: + blocks = self.read_blocks(fh, [ + ('file_id', 16, False), + ('file_size', 8, False), + ('creation_date', 8, True), + ('data_packets_count', 8, True), + ('play_duration', 8, True), + ('send_duration', 8, True), + ('preroll', 8, True), + ('flags', 4, False), + ('minimum_data_packet_size', 4, True), + ('maximum_data_packet_size', 4, True), + ('maximum_bitrate', 4, False), + ]) + # According to the specification, we need to subtract the preroll from play_duration + # to get the actual duration of the file + preroll = blocks.get('preroll') / 1000 + self.duration = max(blocks.get('play_duration') / 10000000 - preroll, 0.0) + elif object_id == Wma.ASF_STREAM_PROPERTIES_OBJECT: + blocks = self.read_blocks(fh, [ + ('stream_type', 16, False), + ('error_correction_type', 16, False), + ('time_offset', 8, True), + ('type_specific_data_length', 4, True), + ('error_correction_data_length', 4, True), + ('flags', 2, True), + ('reserved', 4, False) + ]) + already_read = 0 + if blocks['stream_type'] == Wma.STREAM_TYPE_ASF_AUDIO_MEDIA: + stream_info = self.read_blocks(fh, [ + ('codec_id_format_tag', 2, True), + ('number_of_channels', 2, True), + ('samples_per_second', 4, True), + ('avg_bytes_per_second', 4, True), + ('block_alignment', 2, True), + ('bits_per_sample', 2, True), + ]) + self.samplerate = stream_info['samples_per_second'] + self.bitrate = stream_info['avg_bytes_per_second'] * 8 / 1000 + if stream_info['codec_id_format_tag'] == 355: # lossless + self.bitdepth = stream_info['bits_per_sample'] + already_read = 16 + fh.seek(blocks['type_specific_data_length'] - already_read, os.SEEK_CUR) + fh.seek(blocks['error_correction_data_length'], os.SEEK_CUR) + else: + fh.seek(object_size - 24, os.SEEK_CUR) # read over onknown object ids + + +class Aiff(ID3): + # + # 
AIFF is part of the IFF family of file formats. + # + # https://en.wikipedia.org/wiki/Audio_Interchange_File_Format#Data_format + # https://web.archive.org/web/20171118222232/http://www-mmsp.ece.mcgill.ca/documents/audioformats/aiff/aiff.html + # https://web.archive.org/web/20071219035740/http://www.cnpbagwell.com/aiff-c.txt + # + # A few things about the spec: + # + # * IFF strings are not supposed to be null terminated. They sometimes are. + # * Some tools might throw more metadata into the ANNO chunk but it is + # wildly unreliable to count on it. In fact, the official spec recommends against + # using it. That said... this code throws the ANNO field into comment and hopes + # for the best. + # + # The key thing here is that AIFF metadata is usually in a handful of fields + # and the rest is an ID3 or XMP field. XMP is too complicated and only Adobe-related + # products support it. The vast majority use ID3. As such, this code inherits from + # ID3 rather than TinyTag since it does everything that needs to be done here. + # + + aiff_mapping = { + # + # "Name Chunk text contains the name of the sampled sound." + # + # "Author Chunk text contains one or more author names. An author in + # this case is the creator of a sampled sound." + # + # "Annotation Chunk text contains a comment. Use of this chunk is + # discouraged within FORM AIFC." Some tools: "hold my beer" + # + # "The Copyright Chunk contains a copyright notice for the sound. text + # contains a date followed by the copyright owner. The chunk ID '[c] ' + # serves as the copyright character. 
" Some tools: "hold my beer" + # + b'NAME': 'title', + b'AUTH': 'artist', + b'ANNO': 'comment', + b'(c) ': 'extra.copyright', + } + + def __init__(self, filehandler, filesize, *args, **kwargs): + ID3.__init__(self, filehandler, filesize, *args, **kwargs) + self._tags_parsed = False + + def _parse_tag(self, fh): + chunk_id, size, form = struct.unpack('>4sI4s', fh.read(12)) + if chunk_id != b'FORM' or form not in (b'AIFC', b'AIFF'): + raise TinyTagException('not an aiff file!') + chunk_header = fh.read(8) + while len(chunk_header) == 8: + sub_chunk_id, sub_chunk_size = struct.unpack('>4sI', chunk_header) + sub_chunk_size += sub_chunk_size % 2 # IFF chunks are padded to an even number of bytes + if sub_chunk_id in self.aiff_mapping and self._parse_tags: + value = self._unpad(fh.read(sub_chunk_size).decode('utf-8')) + self._set_field(self.aiff_mapping[sub_chunk_id], value) + elif sub_chunk_id == b'COMM': + self.channels, num_frames, self.bitdepth = struct.unpack('>hLh', fh.read(8)) + try: + exponent, mantissa = struct.unpack('>HQ', fh.read(10)) # Extended precision + self.samplerate = int(mantissa * (2 ** (exponent - 0x3FFF - 63))) + self.duration = num_frames / self.samplerate + self.bitrate = self.samplerate * self.channels * self.bitdepth / 1000 + except OverflowError: + self.samplerate = self.duration = self.bitrate = None # invalid sample rate + fh.seek(sub_chunk_size - 18, 1) # skip remaining data in chunk + elif sub_chunk_id in (b'id3 ', b'ID3 ') and self._parse_tags: + ID3._parse_tag(self, fh) + elif sub_chunk_id == b'SSND': + self.audio_offset = fh.tell() + fh.seek(sub_chunk_size, 1) + else: # some other chunk, just skip the data + fh.seek(sub_chunk_size, 1) + chunk_header = fh.read(8) + self._tags_parsed = True + + def _determine_duration(self, fh): + if not self._tags_parsed: + self._parse_tag(fh) diff --git a/src/variables.py b/src/variables.py new file mode 100644 index 0000000..2e08d87 --- /dev/null +++ b/src/variables.py @@ -0,0 +1,73 @@ +#!/usr/bin/env 
python +# +# Musort - A command-line tool for effortlessly organizing and renaming your music files based on metadata +# Copyright (C) 2023 tdeerenberg +# +# Sources on GitHub: +# https://github.com/tdeerenberg/Musort +# +# Licensed under the GNU General Public License v3.0 (GPLv3) +# Copyright (C) 2023 tdeerenberg +# +# This program is free software: you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation, either version 3 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see <https://www.gnu.org/licenses/>. + +version_text = "Musort v1.0 (C) tdeerenberg" +supported_formats = ["flac", "mp3", "mp2", "mp1", "opus", "ogg", "wma"] +invalid_characters = ["\\", "/", "|", "*", "<", ">", '"', "'", "?"] +default_separator = '_' +help_text = \ +"""Musort (c) 2023 tdeerenberg (github.com/tdeerenberg) + +DESCRIPTION: +A command-line tool for effortlessly organizing and renaming your music files based on metadata + +USAGE: +musort [DIRECTORY] [OPTIONAL_PARAMETERS] + +USAGE EXAMPLES: + musort ~/music + musort /local/music -f disc.artist.title.album -r + musort ~/my_music -s _ -r + +OPTIONS: +-h, --help Show the help menu +-f, --format Set the naming convention (see 'NAMING CONVENTION:' below) +-s, --separator Set the separator for the filename (ex. '-s .' -> 01.track.flac and '-s -' -> 01-track.mp3) + Default separator '_' will be used if none is given +-r, --recursive Rename files in subdirectories as well +-v, --version Print the version number + +NAMING CONVENTION: +FORMAT_OPTION.FORMAT_OPTION... The number of format options does not matter. 
+ It can be one, two, three, even all of them. + (See FORMAT OPTIONS below for all options) +FORMAT OPTIONS: +album album as string +albumartist album artist as string +artist artist name as string +audio_offset number of bytes before audio data begins +bitdepth bit depth for lossless audio +bitrate bitrate in kBits/s +comment file comment as string +composer composer as string +disc disc number +disc_total the total number of discs +duration duration of the song in seconds +filesize file size in bytes +genre genre as string +samplerate samples per second +title title of the song +track track number as string +track_total total number of tracks as string +year year or date as string""" \ No newline at end of file
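
The help text above describes a dot-separated naming convention (`-f disc.artist.title.album`) combined with a separator (`-s`). As a rough illustration of how those two options could interact with `invalid_characters`, here is a minimal sketch. The function `build_filename`, its parameters, and the plain-dict `metadata` argument are hypothetical and are not taken from Musort; the actual renaming logic in the repository may differ.

```python
# Hypothetical sketch, NOT Musort's implementation: turn a dot-separated
# format string into a filename, stripping characters that are unsafe in paths.

# Same values as `invalid_characters` in src/variables.py.
INVALID_CHARACTERS = ["\\", "/", "|", "*", "<", ">", '"', "'", "?"]


def build_filename(metadata, naming_format, separator="_", extension="flac"):
    """Join the requested metadata fields with `separator`.

    `metadata` is assumed to be a plain dict keyed by format option
    (e.g. "track", "title"); options with no value are skipped.
    """
    parts = []
    for option in naming_format.split("."):
        value = str(metadata.get(option, "")).strip()
        # Remove every character that is listed as invalid for filenames.
        for character in INVALID_CHARACTERS:
            value = value.replace(character, "")
        if value:
            parts.append(value)
    return separator.join(parts) + "." + extension


if __name__ == "__main__":
    tags = {"track": "01", "title": "AC/DC?"}
    print(build_filename(tags, "track.title"))
```

Skipping empty fields is one possible design choice; it avoids doubled separators when a tag is missing, at the cost of filenames whose field count varies between files.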