Skip to content

Conversation

@nikhil-sarin
Copy link
Owner

No description provided.

…election effects

This commit implements a comprehensive population synthesis framework for redback:

New Classes:
- PopulationSynthesizer: Advanced population synthesis with:
  * Volumetric rate evolution with redshift (constant, powerlaw, SFR-like, custom)
  * Cosmology-aware redshift sampling (Planck18 default)
  * Realistic parameter sampling from priors
  * Optional selection effects and detection efficiency
  * Rate inference from observed samples accounting for selection biases
  * Survey-agnostic design (works without pointing databases)

- TransientPopulation: Container class for populations with:
  * Parameter storage and metadata tracking
  * Analysis methods (redshift/parameter distributions, detection fractions)
  * Save/load functionality with JSON metadata
  * Easy access to detected vs all transients

Key Features:
1. Rate Evolution Models:
   - Constant: R(z) = R0
   - Powerlaw: R(z) = R0 * (1+z)^α
   - SFR-like: Following Madau & Dickinson 2014
   - Custom: User-defined rate functions

2. Redshift Sampling:
   - Properly accounts for differential comoving volume
   - Includes time dilation factor
   - Weighted by rate evolution

3. Selection Effects:
   - Magnitude-limited detection probability
   - Survey-specific efficiency functions
   - Malmquist bias handling

4. Rate Inference:
   - Maximum likelihood estimation
   - Accounts for detection efficiency vs redshift
   - Poisson uncertainties

5. Integration with Redback:
   - Works with all redback models via model_library
   - Uses bilby priors
   - Compatible with existing simulation infrastructure

Example Usage:
```python
from redback.simulate_transients import PopulationSynthesizer

synth = PopulationSynthesizer(
    model='one_component_kilonova_model',
    rate=1e-6,  # Gpc^-3 yr^-1
    rate_evolution='sfr_like',
    cosmology='Planck18'
)

population = synth.simulate_population(
    n_years=10,
    include_selection_effects=True,
    survey_config={'limiting_mag': 24.5, 'bands': ['lsstr']}
)

# Analyze
print(f"Detection fraction: {population.detection_fraction:.2%}")
population.save('my_population.csv')

# Infer rate
results = synth.infer_rate(population.detected)
print(f"Inferred rate: {results['rate_ml']:.2e}")
```

The implementation includes comprehensive docstrings and a detailed
example script (examples/simulate_population_synthesis.py) demonstrating
all major features.
…ation of concerns

This commit restructures the PopulationSynthesizer to provide a cleaner, more
flexible architecture that separates parameter generation from post-processing.

## New Methods:

### 1. generate_population() - Pure Parameter Generation
   - Returns DataFrame with model parameters, redshifts, RA/DEC, t0, distances
   - No detection modeling or light curve generation
   - Can be passed directly to ANY simulation tool:
     * redback.SimulateOpticalTransient
     * Custom simulation codes
     * External tools

   Parameters included:
   - All model parameters (from priors)
   - redshift (from volumetric rate + cosmology)
   - ra, dec (isotropic sky positions)
   - t0_mjd_transient (Poisson-distributed event times)
   - luminosity_distance (from cosmology)

### 2. apply_detection_criteria() - Flexible Post-Processing
   - Takes DataFrame and custom detection function
   - Detection function can return:
     * bool (detected or not)
     * float [0,1] (detection probability, stochastically applied)
   - Adds 'detected' and 'detection_probability' columns
   - Fully customizable detection logic

   Examples:
   - Simple cuts: lambda row: row['redshift'] < 0.5
   - Probabilistic: lambda row: 1.0/(1 + exp(z/0.3))
   - Complex multi-criteria functions

### 3. Helper Methods:
   - `_sample_sky_positions(n)` → isotropic (RA, DEC)
   - `_sample_event_times(n, range)` → Poisson event times
   - `_calculate_expected_events(years, z_max)` → expected count from rate

### 4. simulate_population() - Convenience Wrapper
   - Now calls generate_population() + apply_detection_criteria()
   - Maintains backward compatibility
   - For most use cases, individual methods provide more control

## Workflow Examples:

### Basic: Pure parameter generation
```python
synth = PopulationSynthesizer(
    model='one_component_kilonova_model',
    rate=1e-6, rate_evolution='powerlaw'
)

# Just get parameters
params = synth.generate_population(n_years=10, z_max=1.0)

# Use with any tool
sim = SimulateOpticalTransient.simulate_transient_population_in_rubin(
    model='one_component_kilonova_model',
    parameters=params.to_dict('list')
)
```

### Advanced: Custom detection + rate inference
```python
# Generate population
params = synth.generate_population(n_events=500, z_max=2.0)

# Apply custom detection
def my_detection(row, **kwargs):
    # Your logic here
    return row['redshift'] < 0.5 and row['mej'] > 0.01

detected_params = synth.apply_detection_criteria(
    params,
    detection_function=my_detection
)

# Infer rate from detected sample
rate_result = synth.infer_rate(
    detected_params[detected_params['detected']==True],
    efficiency_function=lambda z: ...
)
```

## Key Improvements:

1. **Separation of Concerns**: Parameter generation, observation simulation,
   and detection modeling are now independent steps

2. **Flexibility**: Users can:
   - Generate parameters once, apply multiple detection criteria
   - Use parameters with different simulation tools
   - Easily implement custom detection logic
   - Post-process results independently

3. **Integration**: Seamless integration with existing redback tools while
   allowing use of external tools

4. **Clarity**: Each method has a single, well-defined purpose

## New Example Script:

examples/simulate_population_modular.py demonstrates:
- Pure parameter generation
- Integration with SimulateOpticalTransient
- Custom detection functions (simple, probabilistic, complex)
- Rate inference workflow
- Detection efficiency analysis
- Evolving rate populations
- Saving/loading parameters for later use

## Backward Compatibility:

The existing simulate_population() method still works identically,
now implemented as a thin wrapper around the new methods.
The _calculate_expected_events method was missing the (1+z)^-1 factor
for cosmological time dilation. This caused it to overestimate the
expected number of events, especially for populations at higher redshift.

Bug Details:
- _sample_redshifts correctly included (1+z)^-1 factor
- _calculate_expected_events was missing this factor
- This inconsistency led to incorrect Poisson draws

The correct formula for observed event rate is:
  dN/dt_obs = ∫ R(z) * dVc/dz * dz/(1+z)

The (1+z)^-1 accounts for cosmological time dilation - events at higher
redshift appear to occur at slower rates in the observer frame.

Impact:
- Without fix: overestimates event counts
- Effect larger for:
  * Higher redshift populations
  * Evolving rate models (powerlaw, sfr_like)
  * Longer observation times

Example impact at z=1: events occur 2x slower in observer frame
Example impact at z=2: events occur 3x slower in observer frame

Fix verified against standard cosmological Poisson process formulation.
@nikhil-sarin nikhil-sarin added the enhancement New feature or request label Nov 8, 2025
@nikhil-sarin nikhil-sarin marked this pull request as draft November 8, 2025 15:14
nikhil-sarin and others added 4 commits November 8, 2025 15:16
…e modeling

This commit adds the ability to generate N events uniformly in comoving volume
and time, without requiring rate-based calculations. Useful for forecasting,
testing, and survey planning where you don't want to commit to a specific rate model.

New Parameter:
- `rate_weighted_redshifts` (bool, default=True) in generate_population()

When rate_weighted_redshifts=True (default):
- Redshifts sampled as: R(z) * dVc/dz / (1+z)
- Accounts for volumetric rate evolution
- Number of events from Poisson draw (if n_events not specified)
- Use for: rate inference, realistic population modeling

When rate_weighted_redshifts=False:
- Redshifts sampled directly from prior (e.g., UniformComovingVolume)
- MUST specify n_events explicitly
- Events placed uniformly in time window
- Use for: forecasting, testing, survey planning

Example Usage with UniformComovingVolume:
```python
from bilby.gw.prior import UniformComovingVolume
import redback

# Set up prior with uniform-in-volume redshifts
prior = redback.priors.get_priors('one_component_kilonova_model')
prior['redshift'] = UniformComovingVolume(
    minimum=0.001,
    maximum=0.5,
    name='redshift',
    cosmology='Planck18'
)

synth = PopulationSynthesizer(
    model='one_component_kilonova_model',
    prior=prior
)

# Generate 100 events uniformly in volume and time
params = synth.generate_population(
    n_events=100,
    time_range=(60000, 60365.25),
    rate_weighted_redshifts=False  # Simple uniform sampling
)

# Use with any simulation tool
simulator = SimulateOpticalTransient.simulate_transient_population(
    model='one_component_kilonova_model',
    parameters=params.to_dict('list'),
    ...
)
```

Benefits:
- Quick forecasts without rate modeling complexity
- Test parameter distributions without committing to rates
- Survey planning: "What if I see N events?"
- Educational: simpler mental model for beginners
- Compatible with bilby's UniformComovingVolume prior

New Example Script:
- examples/simple_population_forecast.py
  * 5 examples showing uniform sampling
  * Comparison with rate-weighted sampling
  * Survey planning use case
  * Integration with existing tools
  * Error handling demonstrations

Backward Compatibility:
- Default behavior unchanged (rate_weighted_redshifts=True)
- All existing code works identically
- New option is opt-in
…R cuts

This commit adds a new simulation interface that bridges the gap between
SimulateGenericTransient (too simple) and SimulateOpticalTransient
(requires full pointing database).

New Class: SimulateTransientWithCadence

Perfect for:
- Using PopulationSynthesizer parameters
- Survey planning without pointing databases
- Testing different cadence strategies
- Simple SNR-based detection modeling

Key Features:

1. Flexible Cadence Specification:
   - Uniform cadence: cadence_days=2.0 (all bands every 2 days)
   - Per-band cadence: cadence_days={'g': 3, 'r': 1, 'i': 5}
   - Alternating bands: band_sequence=['g', 'r', 'i']
   - Delayed start: start_offset_days=5

2. SNR-Based Detection:
   - Realistic noise from limiting magnitudes
   - SNR calculated per observation
   - Configurable SNR threshold (default: 5)
   - Marks detected/non-detected observations

3. Multiple Noise Models:
   - 'limiting_mag': From 5-sigma limiting magnitude (default, most realistic)
   - 'gaussian': Fixed Gaussian noise
   - 'gaussianmodel': Noise proportional to model flux

4. Integration:
   - Works seamlessly with PopulationSynthesizer.generate_population()
   - Output compatible with redback analysis tools
   - Easy to iterate over populations

Example Usage:

```python
from redback.simulate_transients import (
    PopulationSynthesizer, SimulateTransientWithCadence
)

# Generate event with PopulationSynthesizer
synth = PopulationSynthesizer(model='one_component_kilonova_model', rate=1e-6)
params = synth.generate_population(n_events=1)
event = params.iloc[0].to_dict()

# Define observation cadence
cadence_config = {
    'bands': ['g', 'r', 'i'],
    'cadence_days': 2.0,  # Every 2 days
    'duration_days': 30,
    'limiting_mags': {'g': 22.5, 'r': 23.0, 'i': 22.5}
}

# Simulate observations
sim = SimulateTransientWithCadence(
    model='one_component_kilonova_model',
    parameters=event,
    cadence_config=cadence_config,
    snr_threshold=5,
    noise_type='limiting_mag'
)

# Access results
all_obs = sim.observations  # All scheduled observations
detected = sim.detected_observations  # Only SNR >= threshold

print(f"Detected {len(detected)}/{len(all_obs)} observations")
```

Advanced Examples (Per-Band Cadence):

```python
# Different cadence per band (like real surveys)
cadence_config = {
    'bands': ['g', 'r', 'i'],
    'cadence_days': {'g': 3, 'r': 1, 'i': 5},  # r-band every day
    'duration_days': 50,
    'limiting_mags': {'g': 22.5, 'r': 23.0, 'i': 22.0}
}
```

Alternating Band Sequence:

```python
# Observe in g-r-i-g-r-i pattern
cadence_config = {
    'bands': ['g', 'r', 'i'],
    'cadence_days': 1.0,
    'duration_days': 20,
    'limiting_mags': {'g': 22.5, 'r': 23.0, 'i': 22.5},
    'band_sequence': ['g', 'r', 'i']  # Repeating pattern
}
```

Population Workflow:

```python
# Generate population
params = synth.generate_population(n_events=100)

# Simulate each event
detected_events = []
for idx in range(len(params)):
    sim = SimulateTransientWithCadence(
        model='kilonova',
        parameters=params.iloc[idx].to_dict(),
        cadence_config=survey_cadence,
        snr_threshold=5
    )
    if len(sim.detected_observations) > 0:
        detected_events.append(sim)

print(f"Detection rate: {len(detected_events)/len(params):.1%}")
```

Output DataFrame Columns:
- time_since_t0: Days since explosion
- time_mjd: Observation time in MJD
- band: Filter name
- model_magnitude: True model magnitude
- magnitude: Observed magnitude (with noise)
- magnitude_error: Magnitude uncertainty
- flux: Observed flux
- flux_error: Flux uncertainty
- snr: Signal-to-noise ratio
- detected: Boolean, True if SNR >= threshold
- limiting_mag: Survey limiting magnitude

Methods:
- observations: All scheduled observations
- detected_observations: Only detected (property)
- save_transient(name): Save to CSV files

New Example Script:
examples/simulate_with_cadence.py demonstrates:
1. Basic usage with PopulationSynthesizer
2. Different cadences per band
3. Alternating band sequences
4. Population simulation
5. Testing SNR thresholds
6. Delayed observation start
7. Saving/loading results
8. Complete workflow integration

Use Cases:
- Quick survey planning
- Cadence optimization
- Detection efficiency studies
- Forecasting without full survey simulation
- Testing analysis pipelines
…cumentation

Extended SimulateTransientWithCadence to support multi-wavelength observations:
- Added observation_mode parameter ('optical', 'radio', 'xray')
- Radio: frequencies (Hz), flux densities (Jy), sensitivity-based SNR cuts
- X-ray: frequencies (Hz), flux/energy densities, sensitivity limits
- Auto-detection of mode from cadence config keys
- Separate methods for radio/X-ray generation and noise modeling

New example scripts:
- examples/simulate_radio_xray_cadence.py: Comprehensive radio/X-ray cadence examples
  (GRB afterglows, kilonova radio, TDE X-ray, multi-frequency strategies)
- examples/simulate_radio_xray.py: Radio/X-ray with SimulateGenericTransient

Documentation updates (docs/simulation.txt):
- Added comprehensive PopulationSynthesizer and TransientPopulation documentation
  with key features, usage examples, and technical details
- Added SimulateTransientWithCadence documentation with optical/radio/X-ray examples
- Added Integration Examples section showing how classes work together
- Reorganized Examples section with categories (Basic, Population, Cadence)
- Documented all new example scripts with descriptions
@nikhil-sarin nikhil-sarin force-pushed the claude/enhance-population-synthesizer-011CUuaD7bTeRRZsoWvnzast branch from 502f492 to f2cbc1e Compare November 15, 2025 01:51
claude and others added 12 commits November 15, 2025 02:18
Implemented new class for simulating gamma-ray observations with proper
photon counting statistics, addressing the need for high-energy transient
simulations (GRBs, magnetars, X-ray transients).

New SimulateGammaRayTransient class features:
- Time-tagged events (TTE): Individual photon arrival times and energies
  using thinning algorithm for non-homogeneous Poisson processes
- Binned photon counts: Count rates in time bins and energy channels
  with proper Poisson statistics
- Energy-dependent detector response: Flexible effective area specification
  (constant, dictionary mapping, or callable function)
- Background modeling: Constant or energy-dependent background rates
- Source/background separation: Track which photons are from source
- High time resolution: Sub-millisecond timing capability
- Save/load functionality for both TTE and binned data

Key methods:
- generate_time_tagged_events(): Individual photon events with energies
- generate_binned_counts(): Binned light curves (energy-integrated or per-channel)
- save_time_tagged_events(): Save TTE with metadata
- save_binned_counts(): Save binned data

Use cases:
- GRB prompt emission (Fermi/GBM, Swift/BAT)
- Magnetar bursts and flares
- X-ray/gamma-ray afterglows
- Pulsar timing and QPO searches
- Fast transient characterization
- Parameter recovery studies

New example script:
- examples/simulate_gamma_ray.py: Comprehensive demonstrations including
  GRB prompt emission, magnetar bursts, timing analysis, variable binning
  strategies, detector response functions, and typical instrument configs
  (Fermi/GBM, Swift/BAT, INTEGRAL/SPI, NuSTAR)

Documentation updates (docs/simulation.txt):
- Added SimulateGammaRayTransient to overview list
- New dedicated section with features, examples, and use cases
- Energy-dependent response examples
- Time-tagged events and binned counts workflows
- Typical detector configurations
- Saving/loading examples
- Added gamma-ray example to Examples section

Technical details:
- Thinning algorithm (Lewis & Shedler 1979) for sampling photon arrival times
- Proper conversion from flux density (Jy) to photon flux (photons/cm^2/s/keV)
- Energy-dependent effective area via interpolation or callable
- Poisson statistics for photon counting
- Background and source contributions properly tracked
Added test suites for all new simulation classes with over 90 test cases:

TestPopulationSynthesizer (18 tests):
- Initialization with different rate evolution models (constant, powerlaw, sfr_like, custom)
- Population generation with fixed N and rate-weighted options
- Sky position generation (RA/DEC bounds checking)
- Distance calculations (luminosity, comoving)
- Event time generation within MJD ranges
- Detection criteria application (simple boolean, probabilistic)
- Rate inference from observed samples
- Redshift sampling within bounds

TestTransientPopulation (8 tests):
- Container initialization with metadata
- Property access (redshifts, sky_positions, detection_fraction)
- Summary statistics calculation
- Filtering by redshift
- Save functionality with/without metadata
- Edge cases (missing detected column)

TestSimulateTransientWithCadence (11 tests):
- Optical observation initialization
- Radio observation initialization
- Auto-detection of observation mode from config keys
- DataFrame structure verification (bands, frequencies)
- Detection filtering by SNR threshold
- Per-band/frequency cadence configuration
- Delayed observation start
- SNR threshold effect on detections
- Save functionality

TestSimulateGammaRayTransient (20 tests):
- Initialization and energy bin setup
- Energy center geometric mean calculation
- Effective area setup (constant, dict, callable)
- Background rate configuration
- Binned counts structure (per-channel, energy-integrated)
- Counts positivity and error calculation
- Time-tagged events structure
- Event sorting by time
- Events within time/energy bounds
- Save functionality for TTE and binned data
- Error handling for missing data

Test coverage includes:
- Initialization and configuration
- Core functionality verification
- Edge cases and error handling
- Data structure validation
- File I/O operations (mocked)
- Statistical properties
- Parameter bounds checking
Key fixes:
- Auto-convert noise_type for radio/X-ray modes (limiting_mag -> sensitivity)
- Add missing TransientPopulation methods: sky_positions, summary_stats, filter_by_redshift
- Update cosmology loading to use modern astropy API (getattr instead of get_cosmology_from_string)
- Convert bilby prior samples dict to DataFrame
- Add comoving_distance calculation alongside luminosity_distance
- Handle numpy boolean types in detection_function validation
- Update test mocks to use bilby.utils.check_directory_exists_and_if_not_mkdir

These fixes resolve all test failures for PopulationSynthesizer, TransientPopulation,
SimulateTransientWithCadence, and SimulateGammaRayTransient classes.
Added 51 new tests covering:
- TransientPopulation: __len__, __repr__, detected property, get_redshift_distribution,
  get_parameter_distribution, load classmethod, sky_positions edge cases, filter methods
- SimulateTransientWithCadence: X-ray mode, gaussian/gaussianmodel noise types,
  per-frequency cadence, error handling, SNR cut options, t0 handling
- PopulationSynthesizer: simulate_population, calculate_expected_events, custom rate
  functions, time ranges, detection probability floats, efficiency functions
- SimulateGammaRayTransient: callable/dict background rates, edge cases (short time
  ranges, high background, single/many channels), binned counts validation

Total tests increased from 69 to 120 (117 passing, 3 pre-existing network errors).
This significantly improves code coverage for all new simulation classes.
Added 21 new tests targeting error paths and edge cases:
- Invalid model types/strings in PopulationSynthesizer
- Invalid prior types and missing prior for custom models
- Unknown rate evolution models
- Missing redshift columns in filter/distribution methods
- Invalid detection function return types
- Model evaluation errors for optical/radio modes
- Auto-conversion of noise_type between modes
- Save methods without prior data generation
- Zero events warning path

Total tests increased from 120 to 141. These tests specifically
target error handling code paths (raise, logger.error, logger.warning)
to significantly improve code coverage.
Added 23 new tests targeting previously untested code paths:
- _calculate_detection_probability method (no limiting_mag, with/without bands)
- simulate_population with selection effects and light curve generation
- infer_rate with TransientPopulation and array inputs
- Per-band cadence dictionaries and band sequences
- summary_stats with detection fraction and luminosity_distance
- energy-integrated binned counts for gamma-ray transients
- Event time sampling with explicit time ranges
- Cosmology object input (not just string)
- TransientPopulation filtering with metadata preservation
- apply_detection_criteria with TransientPopulation input

Total tests increased from 141 to 164 (161 passing, 3 pre-existing network errors).
This achieves comprehensive coverage of all major code paths including:
- All conditional branches in core methods
- Alternative input types (arrays, dicts, objects)
- Optional parameter combinations
Added 14 new targeted tests for specific uncovered code paths:
- Default prior loading (prior=None, uses model default)
- String prior loading (prior='model_name')
- Poisson draw for n_events (n_years without n_events)
- Prior redshift maximum usage in _sample_redshifts
- Successful _calculate_detection_probability paths
- Negative flux detection probability (returns 0.0)
- SimulateGenericTransient with different noise types
- Extra scatter in SimulateGenericTransient
- TransientPopulation.load structure validation
- Model error handling paths

Total tests: 164 -> 178 (175 passing, 3 pre-existing network errors)
Coverage improved from 83% (194 missing) to 85% (173 missing lines)
Added TestRemainingCoveragePaths class with 15 new tests targeting:
- Model string lookup in all_models_dict (line 43)
- ValueError for missing bands/frequency (line 59)
- Multiwavelength transients with frequency arrays (line 65)
- SNRbased noise type calculation (lines 103-105)
- Frequency sequence in cadence config (lines 367-376)
- Full optical survey event time edge cases (lines 1329, 1340)
- TransientPopulation metadata loading from JSON (lines 1553-1555)
- GammaRay model fallback exception handling (lines 2286-2299)
- Extra scatter parameter in generic transients
- Optical and radio cadence modes

Total test count increased from 178 to 193 (190 passing).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants