synbols package¶
Submodules¶
synbols.data_io module¶
-
class
synbols.data_io.
H5Stack
(file, name, n_samples, chunk_size=10, compression='gzip')¶ Bases:
object
-
add
(x)¶
-
-
synbols.data_io.
add_splits
(fd, split_dict, random_seed)¶
-
synbols.data_io.
load_attributes_h5
(file_path)¶ Load the dataset from h5py format
Parameters: file_path – path to the hdf5 dataset
Returns: - attributes – list of length n_samples, containing a dictionary of attributes for each image.
- splits – dict of the different types of splits for this dataset. Each split is a binary array of shape (n_samples, n_subset) representing a specific partition.
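The split layout described above (one row per sample, one column per subset) can be illustrated with a minimal sketch. It uses plain Python lists as a stand-in for the array, and the helper name `subset_indices` and the toy matrix are hypothetical, not part of synbols.

```python
# Hypothetical stand-in for one entry of the `splits` dict: a binary
# matrix with one row per sample and one column per subset.
split = [
    [1, 0, 0],
    [0, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
    [1, 0, 0],
]


def subset_indices(split_matrix):
    """Return, for each subset (column), the indices of its samples."""
    n_subset = len(split_matrix[0])
    return [
        [i for i, row in enumerate(split_matrix) if row[j]]
        for j in range(n_subset)
    ]


indices = subset_indices(split)
print(indices)  # [[0, 2, 4], [1], [3]]
```

Because each row holds a single 1, every sample belongs to exactly one subset, which is what makes the matrix a partition.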
-
synbols.data_io.
load_dataset_jpeg_sequential
(file_path, max_samples=None)¶
-
synbols.data_io.
load_h5
(file_path)¶ Load the dataset from h5py format
Parameters: file_path – path to the hdf5 dataset
Returns: - x – array of shape (n_samples, width, height, n_channels), containing images.
- mask – array of shape (n_samples, width, height, n_symbols), containing the mask of each symbol in the image.
- attributes – list of length n_samples, containing a dictionary of attributes for each image.
- splits – dict of the different types of splits for this dataset. Each split is a list of masks, one for each subset.
-
synbols.data_io.
load_minibatch_h5
(file_path, indices)¶
-
synbols.data_io.
load_npz
(file_path)¶ Load the dataset from compressed numpy format (npz).
-
synbols.data_io.
pack_dataset
(generator)¶ Turn the output of a generator of (x, y) pairs into a numpy array containing the full dataset.
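As a rough illustration of the packing idea, the sketch below drains a generator of (x, y) pairs into two parallel lists. It is not the library's implementation: `pack_pairs` and `toy_generator` are hypothetical names, and plain lists stand in for the numpy array (stacking the collected `xs` with `numpy.stack` would be one way to obtain an array).

```python
def pack_pairs(generator):
    """Collect (x, y) pairs from a generator into two parallel lists."""
    xs, ys = [], []
    for x, y in generator:
        xs.append(x)
        ys.append(y)
    return xs, ys


def toy_generator(n):
    # Hypothetical data: x is a 2x2 "image", y is an attribute dict.
    for i in range(n):
        yield [[i, i], [i, i]], {"char": chr(ord("a") + i)}


xs, ys = pack_pairs(toy_generator(3))
```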
-
synbols.data_io.
write_h5
(file_path, dataset_generator, n_samples, split_function=None, ratios=(0.6, 0.2, 0.2), random_seed=42)¶
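The signature above carries no description, so the following is only one plausible reading of `ratios` and `random_seed`: a seeded random partition whose subset sizes follow the given ratios, encoded as a binary matrix of shape (n_samples, n_subset). The helper `random_split` is hypothetical and is not the function synbols uses.

```python
import random


def random_split(n_samples, ratios=(0.6, 0.2, 0.2), random_seed=42):
    """Return a binary matrix of shape (n_samples, len(ratios))."""
    rng = random.Random(random_seed)
    order = list(range(n_samples))
    rng.shuffle(order)

    # Cumulative boundaries; the last subset absorbs rounding leftovers.
    bounds, total = [], 0.0
    for ratio in ratios[:-1]:
        total += ratio
        bounds.append(int(round(total * n_samples)))
    bounds.append(n_samples)

    mask = [[0] * len(ratios) for _ in range(n_samples)]
    start = 0
    for j, end in enumerate(bounds):
        for i in order[start:end]:
            mask[i][j] = 1
        start = end
    return mask


mask = random_split(10)
```

With 10 samples and the default ratios, the three columns contain 6, 2 and 2 ones, and each row contains exactly one.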
-
synbols.data_io.
write_jpg_zip
(directory, generator)¶ Write the dataset in a zipped directory using jpeg and json for each image.
-
synbols.data_io.
write_npz
(file_path, generator)¶
synbols.drawing module¶
-
class
synbols.drawing.
Camouflage
(stroke_length=0.4, stroke_width=0.05, stroke_angle=0.7853981633974483, stroke_noise=0.02, n_stroke=500, seed=None)¶ Bases:
synbols.drawing.RandomPattern
-
draw
(ctxt)¶
-
set_as_source
(ctxt)¶
-
to_json
()¶
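The default `stroke_angle` in the signature above is pi / 4 written out as a float. The short check below confirms this and, as a purely hypothetical illustration, computes the displacement of one noise-free stroke of the default length along that angle; how Camouflage actually draws its strokes is not stated here.

```python
import math

stroke_angle = 0.7853981633974483   # default from the signature above
stroke_length = 0.4                 # default from the signature above

# The default angle is pi / 4, i.e. 45 degrees.
is_quarter_pi = math.isclose(stroke_angle, math.pi / 4)

# Hypothetical: end-point displacement of a single noise-free stroke.
dx = stroke_length * math.cos(stroke_angle)
dy = stroke_length * math.sin(stroke_angle)
```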
-
-
class
synbols.drawing.
Gradient
(alpha=1, types=('radial', 'linear'), random_color=None, seed=None)¶ Bases:
synbols.drawing.RandomPattern
Uses linear or radial gradients to render patterns.
-
set_as_source
(ctxt)¶
-
-
class
synbols.drawing.
Image
(symbols, resolution=(32, 32), background=<synbols.drawing.NoPattern object>, inverse_color=False, pixel_noise_scale=0.01, is_gray=False, max_contrast=True, seed=None)¶ Bases:
object
High-level class for generating an image with symbols, based on attributes.
-
symbols
¶ a list of objects of type Symbol
-
resolution
¶ a pair of integers describing the resolution of the image. Defaults to (32, 32).
-
background
¶ an object of type Pattern for rendering the background of the image. Defaults to NoPattern.
-
inverse_color
¶ Boolean, specifying if the colors should be inverted. Defaults to False.
-
pixel_noise_scale
¶ The standard deviation of the pixel noise. Defaults to 0.01.
-
max_contrast
¶ Boolean, specifying if the image contrast should be maximized after rendering. If True, the pixel values will be linearly mapped to the range [0, 1] within an image. Defaults to True.
-
seed
¶ The random seed of an image. For the same seed, the same image will be rendered. Defaults to None.
-
add_symbol
(symbol)¶
-
attribute_dict
()¶
-
make_image
()¶
-
make_mask
()¶
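The `max_contrast` and `inverse_color` attributes above describe simple per-pixel transforms. The sketch below shows each one independently on a flattened list of gray values. The function names, the handling of a flat image, and the absence of any ordering between the two steps are assumptions made for illustration, not the behaviour of Image.make_image.

```python
def maximize_contrast(pixels):
    """Linearly map pixel values so they span [0, 1] within one image."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        # Assumption: a flat image has no contrast to stretch.
        return [0.0 for _ in pixels]
    return [(p - lo) / (hi - lo) for p in pixels]


def invert(pixels):
    """Return 1 - pixel_value for every pixel."""
    return [1.0 - p for p in pixels]


image = [0.25, 0.5, 0.75]            # hypothetical flattened gray image
stretched = maximize_contrast(image)
inverted = invert(stretched)
```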
-
-
class
synbols.drawing.
ImagePattern
(root='/images', rotation=0, translation=0.0, crop=True, min_crop_size=0.2, seed=None)¶ Bases:
synbols.drawing.RandomPattern
Uses natural images to render patterns.
Parameters: - root – str, Base path to search for images.
- rotation – float, Maximum random rotation in radians, default 0.
- translation – float, Maximum random translation in proportion, default 0.0.
- crop – bool, Whether to take a random crop of the image or not, default True.
- min_crop_size – float, Crop’s minimal proportion from the image, default 0.2.
- seed – Optional[int], Random seed to use for transformation, default None.
-
draw
(ctxt)¶
-
set_as_source
(ctxt)¶
-
class
synbols.drawing.
MultiGradient
(alpha=0.5, n_gradients=2, types=('radial', 'linear'), random_color=None, seed=None)¶ Bases:
synbols.drawing.RandomPattern
Renders multiple gradient patterns with transparency.
-
draw
(ctxt)¶
-
set_as_source
(ctxt)¶
-
-
class
synbols.drawing.
NoPattern
¶ Bases:
synbols.drawing.Pattern
-
draw
(ctxt)¶
-
set_as_source
(ctxt)¶
-
-
class
synbols.drawing.
Pattern
¶ Bases:
object
Base class for all patterns
-
attribute_dict
()¶
-
draw
(ctxt)¶
-
set_as_source
(ctxt)¶
-
surface
(width, height)¶
-
-
class
synbols.drawing.
RandomPattern
¶ Bases:
synbols.drawing.Pattern
Base class for patterns using a seed.
-
attribute_dict
()¶
-
-
class
synbols.drawing.
SolidColor
(color=None)¶ Bases:
synbols.drawing.Pattern
Uses fixed color to render pattern.
-
draw
(ctxt)¶
-
set_as_source
(ctxt)¶
-
-
class
synbols.drawing.
Symbol
(alphabet, char, font, foreground, is_slant, is_bold, rotation, scale, translation)¶ Bases:
object
Class containing attributes describing each symbol
-
alphabet
¶ Object of type Alphabet
-
char
¶ string of 1 or more characters in the image
-
font
¶ string describing the font used to draw characters
-
foreground
¶ object of type Pattern, used for the foreground of the symbol
-
is_slant
¶ bool describing if char is italic or not
-
is_bold
¶ bool describing if char is bold or not
-
rotation
¶ float, rotation angle of the text
-
scale
¶ float, scale of the text. A scale of 1 will have the longest extent of the symbol cover the whole image.
-
translation
¶ relative (x, y) translation of the text. A translation in the range [-1, 1] will ensure that the symbol fits entirely in the image. Note if the scale i
-
attribute_dict
()¶ Returns a dict of all attributes of the symbol.
-
draw
(ctxt)¶
-
make_mask
(resolution)¶ Creates a grey scale image corresponding to the mask of the symbol.
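The relation between `scale` and `translation` described in the attributes above can be made concrete with a one-dimensional sketch. The formula is an assumption chosen to match the description (a translation of ±1 pushes the symbol exactly to the border), not code taken from synbols, and the names `symbol_offset` and `fits` are hypothetical.

```python
def symbol_offset(translation, scale):
    """Map a relative translation to an offset from the image center.

    Assumption: the image has unit size, the symbol's longest extent is
    `scale`, and the free margin on each side is (1 - scale) / 2.
    """
    margin = (1.0 - scale) / 2.0
    return translation * margin


def fits(translation, scale):
    """True if a symbol of extent `scale` stays inside the unit image."""
    center = 0.5 + symbol_offset(translation, scale)
    return center - scale / 2.0 >= 0.0 and center + scale / 2.0 <= 1.0


offset_at_border = symbol_offset(1.0, 0.5)
```

Under this assumption, a translation outside [-1, 1] moves part of the symbol past the border, which produces a cropping effect.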
-
-
synbols.drawing.
color_sampler
(rng=<module 'numpy.random'>, brightness_range=(0, 1))¶
-
synbols.drawing.
draw_symbol
(ctxt, attributes)¶ Core function drawing the characters as described in attributes
Parameters: - ctxt – cairo context to draw the image
- attributes – Object of type Symbol
Returns: - extent – rectangle containing the text in the coordinates of the context.
- extent_main_char – rectangle containing the central character in the coordinates of the context.
synbols.generate module¶
-
synbols.generate.
add_occlusion
(attr_sampler, n_occlusion=None, occlusion_char=None, rotation=None, scale=None, translation=None, foreground=None)¶ Augment an attribute sampler to add occlusions over the other symbols.
Parameters: - attr_sampler – a callable returning an object of type drawing.Image.
- n_occlusion – integer or a distribution over it. Specifies the number of occlusions to draw. Defaults to Uniform([1 .. 5]).
- occlusion_char – string or a distribution over it. Specifies the unicode symbols used to make occlusions. Defaults to Uniform(['■', '▲', '●']).
- rotation – float or a distribution over it. Rotation of the symbol in radians, in the range [-pi .. pi]. Defaults to Uniform([-pi .. pi]).
- scale – float or a distribution over it. Scale of the symbol. A scale of 1 will have either the width or height cover the whole image. Defaults to 0.3 * exp(Normal(0, 0.1)).
- translation – a pair of floats or a distribution over it. Numbers in [-1 .. 1] will make sure the symbol stays within the image, i.e. the actual translation depends on the remaining space after the symbol is scaled. Defaults to Uniform(-1.5, 1.5).
- foreground – object of type drawing.Pattern or a distribution over it. Defines how the foreground will be rendered. Defaults to drawing.Gradient.
Returns: A callable taking an optional seed as an argument and returning an object of type drawing.Image.
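To give a feel for the default occlusion distributions listed above, the sketch below samples them with the standard library's `random.Random`. It is an illustration only: the helper names are hypothetical, dictionaries stand in for drawing.Symbol objects, the foreground pattern is left out, and reading Uniform([1 .. 5]) as including both ends is an assumption.

```python
import math
import random

OCCLUSION_CHARS = ["■", "▲", "●"]


def sample_occlusion(rng):
    """Draw one occlusion's attributes from the documented defaults."""
    return {
        "char": rng.choice(OCCLUSION_CHARS),
        "rotation": rng.uniform(-math.pi, math.pi),
        "scale": 0.3 * math.exp(rng.gauss(0.0, 0.1)),
        "translation": (rng.uniform(-1.5, 1.5), rng.uniform(-1.5, 1.5)),
    }


def sample_occlusions(seed=None):
    """Sample a list of occlusions; the same seed gives the same list."""
    rng = random.Random(seed)
    # Assumption: Uniform([1 .. 5]) includes both 1 and 5.
    n_occlusion = rng.randint(1, 5)
    return [sample_occlusion(rng) for _ in range(n_occlusion)]


occlusions = sample_occlusions(seed=0)
```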
-
synbols.generate.
basic_attribute_sampler
(alphabet=None, char=None, font=None, background=None, foreground=None, is_slant=None, is_bold=None, rotation=None, scale=None, translation=None, inverse_color=None, max_contrast=None, pixel_noise_scale=None, resolution=(32, 32), is_gray=False, n_symbols=1)¶ Returns a function that generates a new Image object on every call.
This function is the high-level interface for defining a new distribution over images. On every call, the returned function produces a drawing.Image object containing every attribute needed to render the final image into a numpy array. All arguments to this function have a proper default value. When no arguments are passed, the result is referred to as the “default” synbols dataset.
All arguments can be either a constant, a callable, or None. If None is passed, the default distribution is used. A callable can be used to define a distribution over the specific argument; it must take one argument specifying the random number generator.
Parameters: - alphabet – object of type utils.Alphabet or a distribution over it. An alphabet can be created easily using Language.get_alphabet(). This argument is only used to specify the default distributions over char and font. If these arguments are specified, alphabet is ignored.
- char – string or a distribution over strings. Defaults to Uniform(alphabet.symbols).
- font – string or a distribution over strings. Defaults to Uniform(alphabet.fonts).
- background – object of type drawing.Pattern or a distribution over it. Defines how the background will be rendered. Defaults to drawing.Gradient.
- foreground – object of type drawing.Pattern or a distribution over it. Defines how the foreground will be rendered. Defaults to drawing.Gradient.
- is_slant – bool or a distribution over bool. Defines if the character is drawn italic or normal. For wider support, this is done using a 2D transformation instead of relying on the font’s italic. Defaults to Uniform{True, False}.
- is_bold – bool or a distribution over bool. Whether the character is rendered in bold or not. Note: some fonts do not support bold, in which case it will have no effect. To obtain a collection of fonts that support bold, use Language.get_alphabet(… support_bold=True).
- rotation – float or a distribution over it. Rotation of the symbol in radians, in the range [-pi .. pi]. Defaults to Normal(0, 0.3).
- scale – float or a distribution over it. Scale of the symbol. A scale of 1 will have either the width or height cover the whole image. Defaults to 0.6 * exp(Normal(0, 0.2)).
- translation – a pair of floats or a distribution over it. Numbers in [-1 .. 1] will make sure the symbol stays within the image, i.e. the actual translation depends on the remaining space after the symbol is scaled. Defaults to Uniform(-1, 1).
- inverse_color – bool or a distribution over it. If True, returns 1 - pixel_value to invert the value of all pixels. Defaults to Uniform([True, False]).
- max_contrast – bool or a distribution over it. If True, pixel values will be rescaled to span 0..1 inside each image. Defaults to True.
- pixel_noise_scale – float or a distribution over it. The standard deviation of the pixel noise. Defaults to 0.01.
- resolution – a pair of integers. Defines the resolution of the image. Defaults to (32, 32).
- is_gray – bool. If True, the color channels are averaged into a single channel. Defaults to False.
- n_symbols – integer or a distribution over it. Number of symbols to render in the image. All arguments that are distributions will be sampled multiple times to provide different symbols. Defaults to 1. Note: if the number of symbols is variable, you will have to provide a proper mask_aggregator when calling dataset_generator, e.g. flatten_mask.
Returns: A callable taking an optional seed as an argument and returning an object of type drawing.Image.
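The "constant, callable, or None" convention described for the arguments above can be sketched as a small resolver. This is a minimal illustration under stated assumptions: `resolve` and `default_scale` are hypothetical names, and the standard library's `random.Random` stands in for whatever random number generator synbols passes to the callables.

```python
import math
import random


def resolve(value, default_sampler, rng):
    """Resolve an argument that may be None, a callable, or a constant."""
    if value is None:
        return default_sampler(rng)   # fall back to the default distribution
    if callable(value):
        return value(rng)             # user-defined distribution
    return value                      # constant, used as-is


def default_scale(rng):
    # Stand-in for the documented default 0.6 * exp(Normal(0, 0.2)).
    return 0.6 * math.exp(rng.gauss(0.0, 0.2))


rng = random.Random(0)
constant_scale = resolve(0.5, default_scale, rng)
custom_scale = resolve(lambda r: r.uniform(0.1, 0.2), default_scale, rng)
sampled_scale = resolve(None, default_scale, rng)
```

Passing the generator into every callable keeps all sampling tied to a single seed, which is what allows an image to be reproduced from its seed.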
-
synbols.generate.
dataset_generator
(attr_sampler, n_samples, mask_aggregator=None, dataset_seed=None)¶ High level function generating the dataset from an attribute sampler.
-
synbols.generate.
flatten_mask
(masks)¶
-
synbols.generate.
flatten_mask_except_first
(masks)¶
-
synbols.generate.
generate_and_write_dataset
(file_path, attr_sampler, n_samples, preview_shape=(10, 10), seed=None)¶ Call the attribute sampler n_samples times to generate a dataset and save it on disk.
Parameters: - file_path – the destination of the dataset. An extension .h5py will be automatically added.
- attr_sampler – a callable returning objects of type drawing.Image.
- n_samples – integer specifying the number of samples required.
- preview_shape – pair of integers or None. Specifies the size of the image grid to render a preview. The png will be saved alongside the dataset.
- seed – integer or None. Specifies the seed of the random number generator.
-
synbols.generate.
generate_char_grid
(language, n_char, n_font, seed=None, **kwargs)¶ Generate a dense grid of n_char x n_font. Mainly for visualization purposes.
-
synbols.generate.
make_preview
(generator, file_name, n_row=10, n_col=10)¶ Augment a generator to save a preview when the first n_row * n_col images are generated.
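The "augment a generator" idea above can be sketched as a pass-through wrapper that buffers the first n_row * n_col items and hands them to a callback once. This is not the library's code: `with_preview` and the `on_preview` callback are hypothetical, and the callback replaces the step that would write the preview image to file_name.

```python
def with_preview(generator, on_preview, n_row=10, n_col=10):
    """Yield items unchanged; call on_preview once with the first
    n_row * n_col items."""
    n_preview = n_row * n_col
    buffer = []
    for item in generator:
        if buffer is not None:
            buffer.append(item)
            if len(buffer) == n_preview:
                on_preview(buffer)
                buffer = None      # stop buffering after the preview
        yield item


previews = []
items = list(with_preview(iter(range(10)), previews.append, n_row=2, n_col=3))
```

In this sketch, a generator that ends before n_row * n_col items never triggers the callback.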
-
synbols.generate.
rand_seed
(rng)¶
-
synbols.generate.
text_generator
(char_list, seed=None, **kwargs)¶ Generate a string of synbols. Mainly for advertisement purposes.
synbols.predefined_datasets module¶
-
synbols.predefined_datasets.
all_chars
(n_samples, seed=None, **kwarg)¶ Combines the symbols of all languages (up to 200 per language). Note: some fonts may appear rarely.
-
synbols.predefined_datasets.
generate_balanced_font_chars_dataset
(n_samples, seed=None, **kwarg)¶ Samples uniformly from all fonts (max 200 per alphabet) or uniformly from all symbols (max 200 per alphabet) with probability 50%.
-
synbols.predefined_datasets.
generate_camouflage_dataset
(n_samples, language='english', texture='camouflage', seed=None, **kwarg)¶ Generate a dataset where the pixel distribution is the same for the foreground and background.
-
synbols.predefined_datasets.
generate_counting_dataset
(n_samples, language='english', resolution=(128, 128), n_symbols=None, scale_variation=0.5, seed=None, **kwarg)¶ Generate 3-10 symbols at various scales. Samples ‘a’ with prob 70% or a Latin lowercase otherwise.
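The character distribution described above is a two-component mixture. The sketch below is one way to sample it with the standard library. `sample_char` is a hypothetical helper, and letting the fallback draw from all 26 lowercase letters (including 'a') is an assumption, since the description does not say whether 'a' is excluded.

```python
import random
import string


def sample_char(rng, p_a=0.7):
    """Return 'a' with probability p_a, else a uniform lowercase letter."""
    if rng.random() < p_a:
        return "a"
    # Assumption: the fallback may also return 'a'.
    return rng.choice(string.ascii_lowercase)


rng = random.Random(0)
chars = [sample_char(rng) for _ in range(10000)]
fraction_a = chars.count("a") / len(chars)
```

Under that assumption the overall share of 'a' is slightly above 70% (0.7 + 0.3 / 26, about 71%).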
-
synbols.predefined_datasets.
generate_counting_dataset_crowded
(n_samples, seed=None, **kwargs)¶ Generate 30-50 symbols at fixed scale. Samples ‘a’ with prob 70% or a Latin lowercase otherwise.
-
synbols.predefined_datasets.
generate_counting_dataset_scale_fix
(n_samples, seed=None, **kwargs)¶ Generate 3-10 symbols at fixed scale. Samples ‘a’ with prob 70% or a Latin lowercase otherwise.
-
synbols.predefined_datasets.
generate_default_dataset
(n_samples, language='english', seed=None, **kwarg)¶ Generate the default dataset, using gradients as foreground and background.
-
synbols.predefined_datasets.
generate_korean_1k_dataset
(n_samples, seed=None, **kwarg)¶ Uses the first 1000 Korean symbols.
-
synbols.predefined_datasets.
generate_large_translation
(n_samples, language='english', seed=None, **kwarg)¶ Synbols are translated beyond the border of the image to create a cropping effect. Scale is fixed to 0.5.
-
synbols.predefined_datasets.
generate_many_small_occlusions
(n_samples, language='english', seed=None, **kwarg)¶ Add small occlusions on all images. The number of occlusions is sampled uniformly in [0,5).
-
synbols.predefined_datasets.
generate_natural_images_dataset
(n_samples, language='english', seed=None, **kwargs)¶ Same as default dataset, but uses natural images as foreground and background.
-
synbols.predefined_datasets.
generate_non_camou_bw_dataset
(n_samples, language='english', seed=None, **kwargs)¶ Generate a black and white dataset with the same attribute distribution as the camouflage dataset.
-
synbols.predefined_datasets.
generate_non_camou_shade_dataset
(n_samples, language='english', seed=None, **kwargs)¶ Generate a gradient foreground and background dataset with same attribute distribution as the camouflage dataset.
-
synbols.predefined_datasets.
generate_pixel_noise
(n_samples, language='english', seed=None, **kwarg)¶ Add large pixel noise with probability 0.5.
-
synbols.predefined_datasets.
generate_plain_dataset
(n_samples, language='english', seed=None, **kwargs)¶ Generate white on black, centered symbols. The only factors of variations are font and char.
-
synbols.predefined_datasets.
generate_segmentation_dataset
(n_samples, language='english', resolution=(128, 128), seed=None, **kwarg)¶ Generate 3-10 symbols of various scale, rotation, and translation (no bold).
-
synbols.predefined_datasets.
generate_solid_bg_dataset
(n_samples, language='english', seed=None, **kwarg)¶ Same as the default dataset, but uses white on black.
-
synbols.predefined_datasets.
generate_some_large_occlusions
(n_samples, language='english', seed=None, **kwarg)¶ With probability 20%, add a large occlusion over the existing symbol.
-
synbols.predefined_datasets.
generate_tiny_dataset
(n_samples, language='english', seed=None, **kwarg)¶ Generate a dataset of 8x8 resolution in gray scale with scale of 1 and minimal variations.
-
synbols.predefined_datasets.
less_variations
(n_samples, language='english', seed=None, **kwarg)¶ Fewer variations in scale and rotation. Also, no bold and no italic. This makes a more accessible font classification task.
-
synbols.predefined_datasets.
missing_symbol_dataset
(n_samples, language='english', seed=None, **kwarg)¶ With 10% probability, no symbols are drawn.
synbols.utils module¶
-
class
synbols.utils.
Alphabet
(name, fonts, symbols)¶ Bases:
object
Combines fonts and symbols for a given language.
-
class
synbols.utils.
Language
(locale_file, font_blacklist_dir)¶ Bases:
object
-
get_alphabet
(standard=True, auxiliary=True, lower=True, upper=False, support_bold=False, include_blacklisted_fonts=False)¶
-
-
synbols.utils.
flatten_attr
(attr, ctxt=None)¶
-
synbols.utils.
language_map_statistics
()¶
-
synbols.utils.
load_all_languages
(override_locale_path=None)¶ Loads all supported languages. Returns a dictionary of Language objects indexed by their name.
-
synbols.utils.
make_img_grid
(x, y, h_axis='char', v_axis='font', n_row=20, n_col=40)¶
synbols.visualization module¶
-
synbols.visualization.
plot_dataset
(x, y, h_axis='char', v_axis='font', n_row=20, n_col=40, hide_axis=False)¶