https://github.com/sciris/sciris
# Sciris

Sciris is a powerful Python library for scientific computing that provides a collection of utilities to simplify common tasks in data analysis, file I/O, parallel processing, and visualization. Built on top of NumPy, Pandas, and Matplotlib, it offers intuitive interfaces for everyday operations like saving/loading objects, timing code, working with dates, and handling nested data structures. The library is designed to reduce boilerplate code and make scientific workflows more efficient and readable.

The core philosophy of Sciris is to provide "batteries-included" functionality that covers the gaps between Python's standard library and scientific computing packages. It includes enhanced containers like `odict` (an ordered dictionary with array-like indexing), flexible file I/O supporting pickles, JSON, and YAML, easy parallelization tools, robust date/time handling, and numerous helper functions for common operations. Whether you're building simulations, analyzing data, or developing scientific applications, Sciris aims to make your code cleaner and more maintainable.

## File I/O Functions

### sc.save() / sc.load()

Save and load any Python object to/from disk using a compressed pickle format. These functions handle serialization automatically with gzip or zstandard compression, making it easy to persist complex objects including custom classes, NumPy arrays, and nested data structures.
```python
import sciris as sc
import numpy as np

# Save any Python object
data = {
    'name': 'experiment_1',
    'results': np.random.rand(100, 50),
    'metadata': {'date': '2024-01-15', 'version': 2.1}
}
sc.save('experiment.obj', data)

# Load the object back
loaded = sc.load('experiment.obj')
print(loaded['name'])          # 'experiment_1'
print(loaded['results'].shape) # (100, 50)

# Save with zstandard compression for a better compression ratio
sc.save('experiment.zst', data, compression='zstd')

# Or use the zsave shortcut
sc.zsave('experiment_zstd.obj', data)

# Loading works automatically with any compression format
loaded_zstd = sc.load('experiment.zst')
```

### sc.savejson() / sc.loadjson()

Save and load JSON files with automatic handling of NumPy arrays, dates, and other non-JSON-serializable types. Provides a more robust alternative to the standard `json` module for scientific data.

```python
import sciris as sc
import numpy as np

# Save data as JSON
config = {
    'parameters': {'alpha': 0.5, 'beta': 1.2},
    'array_data': np.array([1, 2, 3, 4, 5]),
    'enabled': True
}
sc.savejson('config.json', config)

# Load JSON with automatic type conversion
loaded_config = sc.loadjson('config.json')
print(loaded_config['parameters']['alpha']) # 0.5

# Pretty-print JSON to file
sc.savejson('config_pretty.json', config, indent=2)
```

### sc.loadtext() / sc.savetext()

Convenience functions for reading and writing text files with minimal boilerplate.

```python
import sciris as sc

# Save text to a file
content = ['Line 1: Introduction', 'Line 2: Methods', 'Line 3: Results']
sc.savetext('document.txt', content)

# Load text from a file
text = sc.loadtext('document.txt')
print(text) # Full file contents as a string

# Load as a list of lines
lines = sc.loadtext('document.txt', splitlines=True)
print(lines[0]) # 'Line 1: Introduction'
```

### sc.thisdir() / sc.glob()

Get the current directory path and find files using glob patterns. Essential utilities for file path handling in scripts.
```python
import sciris as sc

# Get the directory of the current script
current_dir = sc.thisdir()
print(current_dir) # e.g., '/home/user/project/scripts'

# Get a file path relative to the current script
config_path = sc.thisdir('config', 'settings.json')

# Find all Python files in a directory
py_files = sc.glob('~/projects', '*.py', abspath=True)
print(py_files) # List of absolute paths to .py files

# Find files recursively
all_csvs = sc.glob('.', '**/*.csv', recursive=True)

# Find only files (not directories)
data_files = sc.glob('./data', filesonly=True)
```

## Ordered Dictionary (odict)

### sc.odict

An enhanced ordered dictionary that supports integer indexing, slicing, and array-like operations. Combines the best features of dictionaries, lists, and arrays into a single flexible container.

```python
import sciris as sc
import numpy as np

# Create an odict
data = sc.odict(
    temperatures=[20.1, 21.3, 22.5, 23.1],
    pressures=[101.2, 101.5, 101.8, 102.0],
    humidity=[45, 50, 55, 60]
)

# Access by key (like a dict)
print(data['temperatures']) # [20.1, 21.3, 22.5, 23.1]

# Access by integer index (like a list)
print(data[0]) # [20.1, 21.3, 22.5, 23.1]
print(data[1]) # [101.2, 101.5, 101.8, 102.0]

# Slice access returns a numpy array
print(data[:2]) # array with the first two values

# Iterate with index, key, and value
for i, key, value in data.enumitems():
    print(f'{i}: {key} = {value}')

# Use as a defaultdict
nested = sc.odict(defaultdict=list)
nested['new_key'].append('auto-created')

# Infinitely nested dictionary
deep = sc.odict(defaultdict='nested')
deep['level1']['level2']['level3'] = 'value'
print(deep['level1']['level2']['level3']) # 'value'
```

### sc.objdict

Like `odict`, but also allows attribute-style access to dictionary keys. Perfect for configuration objects and structured data.
```python
import sciris as sc

# Create an objdict
config = sc.objdict(
    model='ResNet50',
    learning_rate=0.001,
    batch_size=32,
    epochs=100
)

# Access via attribute (cleaner syntax)
print(config.model)         # 'ResNet50'
print(config.learning_rate) # 0.001

# Still works like a dictionary
config['dropout'] = 0.5
print(config.dropout) # 0.5

# Iterate like a dictionary
for key, value in config.items():
    print(f'{key}: {value}')

# Convert from a regular dict
settings = sc.objdict({'name': 'experiment', 'seed': 42})
print(settings.name) # 'experiment'
```

## Date and Time Functions

### sc.now() / sc.date()

Get the current time and convert various date formats to Python datetime objects. Provides flexible date parsing and formatting.

```python
import sciris as sc

# Get the current time
current_time = sc.now()
print(current_time) # datetime object: 2024-01-15 14:30:45

# Get the time as a string
time_str = sc.now(astype='str')
print(time_str) # '2024-Jan-15 14:30:45'

# Get the time in a specific timezone
utc_time = sc.now(utc=True)
pacific = sc.now(timezone='US/Pacific')

# Convert a string to a date
date1 = sc.date('2024-03-15')
print(date1) # datetime.date(2024, 3, 15)

# Convert multiple dates at once
dates = sc.date(['2024-01-01', '2024-06-15', '2024-12-31'])

# Convert an integer offset to a date
day_10 = sc.date(10, start_date='2024-01-01')
print(day_10) # datetime.date(2024, 1, 11)

# Output as a string
date_str = sc.date('2024-03-15', to='str', outformat='%Y/%m/%d')
print(date_str) # '2024/03/15'
```

### sc.daterange() / sc.datedelta()

Generate date ranges and perform date arithmetic. Useful for time series analysis and scheduling.
```python
import sciris as sc

# Generate a range of dates
dates = sc.daterange('2024-01-01', '2024-01-10')
print(dates) # ['2024-01-01', '2024-01-02', ..., '2024-01-10']

# Generate dates with an interval
monthly = sc.daterange('2024-01-01', '2024-12-01', interval='month')
print(monthly) # First day of each month

# Generate dates using a delta
dates_5weeks = sc.daterange('2024-01-01', weeks=5)

# Perform date arithmetic
future = sc.datedelta('2024-01-15', days=30)
print(future) # '2024-02-14'

past = sc.datedelta('2024-06-15', months=-3)
print(past) # '2024-03-15'

# Add multiple units
new_date = sc.datedelta('2024-01-01', years=1, months=2, days=15)
print(new_date) # '2025-03-16'

# Calculate the difference between dates
diff = sc.daydiff('2024-01-01', '2024-03-15')
print(diff) # 74 days
```

### sc.tic() / sc.toc() / sc.timer()

Simple and intuitive timing functions for measuring code execution time. Much cleaner than using the `time` module directly.

```python
import sciris as sc
import numpy as np

# Simple timing with tic/toc
sc.tic()
result = np.random.rand(1000, 1000) @ np.random.rand(1000, 1000)
sc.toc() # Prints: Elapsed time: 0.234 s

# Named timing
sc.tic()
# ... some operation ...
elapsed = sc.toc(output=True) # Returns the elapsed time without printing

# Timer context manager (recommended)
with sc.timer('Matrix multiplication'):
    result = np.random.rand(1000, 1000) @ np.random.rand(1000, 1000)
# Output: Matrix multiplication: 0.234 s

# Timer object for multiple measurements
T = sc.timer()

T.tic()
np.sort(np.random.rand(1_000_000))
T.toc('Sorting 1M elements')

T.tic()
np.fft.fft(np.random.rand(1_000_000))
T.toc('FFT of 1M elements')

# Get a timing summary
print(T) # Shows all recorded times
print(f'Total time: {T.total} seconds')
```

## Parallelization

### sc.parallelize()

Easy parallelization of functions across multiple CPU cores. Abstracts away the complexity of multiprocessing with a simple interface.
```python
import sciris as sc
import numpy as np

# Define a function to parallelize
def process_data(x, multiplier=1):
    sc.randsleep() # Simulate variable processing time
    return x ** 2 * multiplier

# Parallelize over a range of values
results = sc.parallelize(process_data, iterarg=range(10))
print(results) # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

# Parallelize with additional arguments
results = sc.parallelize(
    process_data,
    iterarg=range(10),
    kwargs={'multiplier': 2}
)
print(results) # [0, 2, 8, 18, 32, 50, 72, 98, 128, 162]

# Use iterkwargs for varying keyword arguments
results = sc.parallelize(
    process_data,
    iterkwargs=[
        {'x': 1, 'multiplier': 1},
        {'x': 2, 'multiplier': 2},
        {'x': 3, 'multiplier': 3},
    ]
)
print(results) # [1, 8, 27]

# Control the number of CPUs
results = sc.parallelize(process_data, iterarg=range(20), ncpus=4)

# Show a progress bar
results = sc.parallelize(process_data, iterarg=range(100), progress=True)

# Run serially for debugging
results = sc.parallelize(process_data, iterarg=range(10), serial=True)
```

### sc.Parallel (Advanced Usage)

For more control over parallel execution, use the `Parallel` class directly with async monitoring.

```python
import sciris as sc

def slow_computation(i):
    sc.randsleep(seed=i)
    return i ** 2

# Create the parallel manager
P = sc.Parallel(
    slow_computation,
    iterarg=range(20),
    parallelizer='multiprocess-async',
    ncpus=4
)

# Start async execution
P.run_async()

# Monitor progress
P.monitor() # Displays a progress bar

# Get the results
P.finalize()
print(P.results) # [0, 1, 4, 9, 16, ...]
print(P.times)   # Timing information for each job
```

## Dataframe Extensions

### sc.dataframe

An extended pandas DataFrame with additional convenience methods for flexible row/column access, data manipulation, and easier syntax for common operations.
```python
import sciris as sc
import numpy as np

# Create a dataframe
df = sc.dataframe(
    a=[1, 2, 3, 4, 5],
    b=[10, 20, 30, 40, 50],
    c=['x', 'y', 'z', 'w', 'v']
)

# Access by column name (standard)
print(df['a']) # Column 'a'

# Access by row index (integer)
print(df[0])   # First row
print(df[1:3]) # Rows 1 and 2

# Access by row and column
print(df['a', 2]) # Value at column 'a', row 2 (result: 3)
print(df[2, 'a']) # Same result, with the order reversed

# Slice operations
df[0, :] = [100, 1000, 'new'] # Set an entire row
print(df)

# Add a new column
df.addcol('d', [1.1, 2.2, 3.3, 4.4, 5.5])

# Remove a column
df.rmcol('d')

# Append a row
df.append([6, 60, 'u'])

# Insert a row at a position
df.insertrow(2, [2.5, 25, 'y.5'])

# Remove a row by value
df.rmrow(100) # Remove the row where the first column equals 100

# Sort by column
df.sort('b', reverse=True)

# Find a row by value
row = df.findrow(3) # Find the row where the first column equals 3
print(row)
```

## Array and Math Utilities

### sc.findinds() / sc.findnearest()

Find array indices matching conditions or nearest values. More flexible than NumPy's native functions.
```python
import sciris as sc
import numpy as np

data = np.array([1.0, 2.5, 3.0, 2.5, 5.0, 2.5, 7.0])

# Find indices where the value equals 2.5 (with floating-point tolerance)
indices = sc.findinds(data, 2.5)
print(indices) # array([1, 3, 5])

# Find the first matching index
first = sc.findinds(data, 2.5, first=True)
print(first) # 1

# Find the last matching index
last = sc.findinds(data, 2.5, last=True)
print(last) # 5

# Find indices matching a condition
small = sc.findinds(data < 3)
print(small) # array([0, 1, 3, 5])

# Multiple conditions
combined = sc.findinds(data > 2, data < 6)
print(combined) # array([1, 2, 3, 4, 5])

# Find the nearest value in an array
series = np.array([0, 10, 20, 30, 40, 50])
nearest_idx = sc.findnearest(series, 23)
print(nearest_idx) # 2 (value 20 is nearest to 23)

# Find the nearest indices for multiple values
indices = sc.findnearest(series, [15, 35, 47])
print(indices) # array([1, 3, 5])
```

### sc.smooth() / sc.rolling()

Apply smoothing and rolling operations to data arrays for noise reduction and trend analysis.

```python
import sciris as sc
import numpy as np

# Create noisy data
np.random.seed(42)
noisy = np.sin(np.linspace(0, 4*np.pi, 100)) + np.random.randn(100) * 0.3

# Simple smoothing
smoothed = sc.smooth(noisy, window=5)

# Rolling average
rolling_avg = sc.rolling(noisy, window=10, operation='mean')

# Rolling with different operations
rolling_sum = sc.rolling(noisy, window=10, operation='sum')
rolling_std = sc.rolling(noisy, window=10, operation='std')

# 2D smoothing
data_2d = np.random.rand(50, 50)
smoothed_2d = sc.smooth(data_2d, window=3)
```

## Printing and Display

### sc.pr() / sc.prettyobj

Pretty-print detailed representations of objects, including methods, properties, and attributes. Invaluable for debugging and exploration.
```python
import sciris as sc
import numpy as np

# Pretty-print any object to see its structure
df = sc.dataframe(a=[1, 2, 3], b=[4, 5, 6])
sc.pr(df) # Shows all methods, properties, and attributes

# Create objects with automatic pretty printing
class MyModel(sc.prettyobj):
    def __init__(self):
        self.weights = np.random.rand(10, 10)
        self.bias = np.zeros(10)
        self.learning_rate = 0.001

    def train(self):
        pass

    def predict(self, x):
        return x @ self.weights + self.bias

model = MyModel()
print(model) # Automatically shows all attributes and methods

# For large objects, use quickobj (doesn't print values)
class BigModel(sc.quickobj):
    def __init__(self):
        self.big_array = np.random.rand(1000, 1000)

big = BigModel()
print(big) # Shows the structure without printing large arrays
```

### sc.sigfig() / sc.heading()

Format numbers with significant figures and print formatted headings for readable output.

```python
import sciris as sc
import numpy as np

# Format a number with significant figures
value = 3432.3842
print(sc.sigfig(value, sigfigs=3)) # '3430'
print(sc.sigfig(value, sigfigs=5)) # '3432.4'

# Use SI notation
print(sc.sigfig(1234567, SI=True))   # '1.235M'
print(sc.sigfig(0.00234, sigfigs=2)) # '0.0023'

# Format an array of numbers
values = np.array([1234, 5678, 91011])
formatted = sc.sigfig(values, sigfigs=2)
print(formatted) # ['1200', '5700', '91000']

# Print formatted headings
sc.heading('Results Summary')
# Output:
# =====================================
# Results Summary
# =====================================

sc.heading('Section 1', level=2)
# Output:
# --- Section 1 ---

# Print with color
sc.heading('Important Notice', color='red')
```

### sc.progressbar()

Display text-based progress bars for long-running operations without additional dependencies.
```python
import sciris as sc
import time

# Simple progress bar in a loop
n = 100
for i in range(n):
    sc.progressbar(i+1, n)
    time.sleep(0.01) # Simulate work

# With a custom label
for i in range(50):
    sc.progressbar(i+1, 50, label='Processing files')
    time.sleep(0.02)

# Manual control
sc.progressbar(25, 100, label='Download progress') # Shows 25%
```

## Nested Data Operations

### sc.getnested() / sc.setnested()

Access and modify deeply nested data structures using key lists. Essential for working with complex JSON configurations and hierarchical data.

```python
import sciris as sc

# Create nested data
data = {
    'experiment': {
        'config': {
            'model': {
                'name': 'transformer',
                'layers': 12,
                'hidden_size': 768
            },
            'training': {
                'epochs': 100,
                'batch_size': 32
            }
        },
        'results': {
            'accuracy': 0.95
        }
    }
}

# Get a nested value using a key list
model_name = sc.getnested(data, ['experiment', 'config', 'model', 'name'])
print(model_name) # 'transformer'

# Set a nested value
sc.setnested(data, ['experiment', 'config', 'training', 'epochs'], 200)
print(data['experiment']['config']['training']['epochs']) # 200

# Create a nested structure automatically
new_data = {}
sc.makenested(new_data, ['level1', 'level2', 'level3'], value='deep value')
print(new_data) # {'level1': {'level2': {'level3': 'deep value'}}}

# Iterate over all nested keys
for keylist in sc.iternested(data):
    value = sc.getnested(data, keylist)
    print(f'{" > ".join(keylist)}: {value}')
```

### sc.search()

Search through nested objects to find values, keys, or patterns. Powerful for exploring complex data structures.
```python
import sciris as sc

# Complex nested object
obj = sc.objdict(
    users=[
        {'name': 'Alice', 'email': 'alice@example.com', 'age': 30},
        {'name': 'Bob', 'email': 'bob@example.com', 'age': 25},
    ],
    settings={'theme': 'dark', 'notifications': True},
    metadata={'version': '1.0', 'name': 'MyApp'}
)

# Search for a value
results = sc.search(obj, 'Alice')
print(results) # Shows the paths where 'Alice' was found

# Search for a key
results = sc.search(obj, 'email', method='key')
print(results) # Shows all paths containing an 'email' key

# Search with a regex pattern
results = sc.search(obj, r'.*@example\.com', method='value')
```

## Memory and Profiling

### sc.checkmem() / sc.checkram()

Check the memory usage of objects and current RAM consumption. Essential for optimizing memory-intensive applications.

```python
import sciris as sc
import numpy as np

# Check the memory usage of an object
big_array = np.random.rand(1000, 1000)
sc.checkmem(big_array) # Output: DataFrame showing memory usage

# Check the memory of a nested structure
data = {
    'small': np.random.rand(10, 10),
    'medium': np.random.rand(100, 100),
    'large': np.random.rand(500, 500)
}
sc.checkmem(data, descend=1) # Shows the memory breakdown for each key

# Check current RAM usage
start_ram = sc.checkram(to_string=False)
large_data = np.random.rand(10000, 1000)
print(sc.checkram(start=start_ram)) # Shows the RAM increase
```

### sc.benchmark()

Quickly benchmark your system's Python and NumPy performance.

```python
import sciris as sc

# Run the standard benchmark
results = sc.benchmark()
print(results) # e.g. {'python': 11.2, 'numpy': 245.3} (MOPS)

# Benchmark only NumPy
numpy_mops = sc.benchmark(which='numpy')
if numpy_mops > 300:
    print('Fast system!')
elif numpy_mops < 100:
    print('Slow system')
else:
    print('Average system')

# More detailed benchmarking
results = sc.benchmark(repeats=10, verbose=True)
```

## 3D Plotting

### sc.plot3d() / sc.scatter3d()

Create 3D visualizations with minimal code. Simplifies matplotlib's 3D plotting interface.
```python
import sciris as sc
import numpy as np
import matplotlib.pyplot as plt

# Create a 3D line plot
t = np.linspace(0, 10*np.pi, 1000)
x = np.sin(t)
y = np.cos(t)
z = t
sc.plot3d(x, y, z, c='index')
plt.title('3D Helix')
plt.show()

# 3D scatter plot
n = 500
x = np.random.randn(n)
y = np.random.randn(n)
z = x**2 + y**2
sc.scatter3d(x, y, z, c=z, cmap='viridis', s=20)
plt.title('3D Scatter with Color')
plt.show()

# Surface plot
x = np.linspace(-5, 5, 50)
y = np.linspace(-5, 5, 50)
X, Y = np.meshgrid(x, y)
Z = np.sin(np.sqrt(X**2 + Y**2))
sc.surf3d(Z, cmap='coolwarm')
plt.title('Surface Plot')
plt.show()
```

## Summary

Sciris excels at reducing the friction in everyday scientific computing tasks. Its primary use cases include: rapid prototyping of data analysis pipelines where you need flexible containers and quick file I/O; building scientific simulations that require parallel processing and timing instrumentation; developing applications that work with complex nested configurations and hierarchical data; and creating reproducible research workflows with robust object serialization. The library is particularly valuable when you find yourself repeatedly writing boilerplate code for common operations like timing, file handling, or date manipulation.

Integration with existing codebases is straightforward, since Sciris builds on standard libraries (NumPy, Pandas, Matplotlib) rather than replacing them. You can adopt individual functions as needed without committing to the entire library. Common patterns include using `sc.objdict` for configuration management, `sc.save`/`sc.load` for checkpointing long-running computations, `sc.parallelize` for embarrassingly parallel workloads, and `sc.timer` for performance monitoring. The library's design philosophy emphasizes sensible defaults while allowing full customization when needed, making it suitable for both quick scripts and production applications.