MEWpy’s documentation¶
Introduction¶
Metabolic Engineering Workbench in python
MEWpy is a Computational Strain Optimization (CSO) tool able to aggregate different types of constraint-based models and simulation approaches. It relies on Evolutionary Algorithms (EAs) to identify the set of genetic modifications that favor and optimize a desired metabolic engineering goal. One of the main advantages of using EAs is that they enable the simultaneous optimizations of more than one objectives (product rate, growth rate, number of modifications, etc.), sometimes conflicting, and deliver in a single run a set of trade-off solutions between the objectives.
Architecture
MEWPy currently supports REFRAMED and COBRApy phenotype simulators integrating both in a common API which enables different methods:
- Flux Balance Analysis (FBA)
- Parsimonious Flux Balance Analysis (pFBA)
- Regulatory On/Off Minimization of metabolic flux (ROOM)
- Minimization of Metabolic Adjustment (MOMA)
- linear version of the Minimization of Metabolic Adjustment (lMOMA)
- Flux Variability Analysis (FVA).
MEWpy also includes implementations of Regulatory FBA (RFBA), Steady-state Regulatory FBA (SRFBA), Probabilistic Regulation of Metabolism (PROM), and CoRegFlux phenotype simulation methods. The optimization engine relies on either Inspyred or jMetalPy packages. MEWPy requires a compatible solver for linear programming problems, with installed Python dependencies installed, from the following list:
Installation¶
MEWpy may be run in python versions greater or equal to 3.6, depending on the available solver:
- CPLEX is currently available for Python<= 3.7
- Gurobi optimizer: version 9.1.* is available for Python 3.6, 3.7, 3.8 and 3.9.
Installing from PyPi
pip install mewpy
Installing from github repository
- Clone the repository
git clone https://github.com/BioSystemsUM/mewpy.git
- Run the installation script
python setup.py install
Phenotype Simulation¶
Loading metabolic models¶
Models can be loaded using REFRAMED or COBRApy:
# using REFRAMED
from reframed.io.sbml import load_cbmodel
model = load_cbmodel('iML1515.xml',flavor='cobra')
# using COBRApy
from cobra.io import read_sbml_model
model = read_sbml_model('iML1515.xml')
In addition, our mewpy.germ
module can be used to load metabolic models in SBML, JSON, COBRApy model or Reframed model formats.
Consult the documentation
for more information.
A simulator object provides a common interface to realize the main phenotype analysis tasks. The get_simulator function returns a simulator, a wrapper, for the provided model. The simulator interface remains the same regardless of how the model was loaded, using REFRAMED or COBRApy. This simplify the use of both environments and ease the management of future changes and deprecation on their APIs.
# build a phenotype simulator
from mewpy.simulation import get_simulator
simul = get_simulator(model)
The simulator offers a wide API, and enable to perform basic tasks, such as, list metabolites, reactions, genes and compartments:
simul.metabolites[:10]
['octapb_c',
'cysi__L_e',
'dhap_c',
'prbatp_c',
'10fthf_c',
'btal_c',
'6pgg_c',
'co2_e',
'akg_e',
'gsn_e']
simul.reactions[:10]
['CYTDK2',
'XPPT',
'HXPRT',
'NDPK5',
'SHK3Dr',
'NDPK6',
'NDPK8',
'DHORTS',
'OMPDC',
'PYNP2r']
simul.genes[:10]
['b2551',
'b0870',
'b3368',
'b2436',
'b0008',
'b3500',
'b2465',
'b0945',
'b4467',
'b3126']
simul.compartments
{'c': 'cytosol', 'e': 'extracellular space', 'p': 'periplasm'}
MEWpy allows to search for specific metabolites, reactions or genes using regular expressions:
# find reactions identified with a term containing 'biomass'
simul.find('biomass')
# find metabolites identified with a term containing 'o2'
simul.find('o2',find_in='m')
A simulator may also be loaded considering environmental conditions, that will be considered during phenotype simulations. In the next example, glucose consumption is limited to 10 mmol/gDW/h while oxygen is set to unlimited.
# environmental conditions
envcond = {'EX_glc__D_e': (-10.0, 100000.0),
'EX_o2_e':(-1000,1000)}
simul = get_simulator(model,envcond=envcond)
All phenotype simulations will consider the imposed environmental conditions, and as such they only need to be set once. Also, these conditions do not persistently alter the model, which can be reused with a different simulator instance.
Phenotype simulation¶
Phenotype simulations are also run using the simulator instance using the simulate
method.
# FBA
result = simul.simulate()
# or
result = simul.simulate(method='FBA')
objective: 0.8769972144269748
Status: OPTIMAL
Flux Balance Analysis (FBA) can be run without identifying any method, or by passing the ‘FBA’ as method parameter. Other phenotype simulation methods may also be run using one of the identifiers:
- Flux Balance Analysis:
method = 'FBA'
- Parsimonious FBA:
method = 'pFBA'
- Minimization of Metabolic Adjustment:
method = 'MOMA'
- Linear MOMA:
method = 'lMOMA'
- Regulatory on/off minimization of metabolic flux:
method = 'ROOM'
# pFBA
result = simul.simulate(method = 'pFBA')
objective: 769.7902399146501
Status: OPTIMAL
Reaction fluxes¶
The phenotype simulation result object, besides containing the objective value and solver status, also include reaction fluxes in the form of a dictionary:
result.fluxes
OrderedDict([('CYTDK2', 0.0),
('XPPT', 0.0),
('HXPRT', 0.0),
('NDPK5', 0.0),
('SHK3Dr', 0.33424030136519417),
('NDPK6', 0.0),
('NDPK8', 0.0),
...])
It is also to possible to retrieve reaction fluxes in the form of a data frame:
result.dataframe
# or using find to list specific fluxes ordered by their values
result.find('glc|biomass', sorted=True)
Reaction ID Flux
0 CYTDK2 0.00000
1 XPPT 0.00000
2 HXPRT 0.00000
3 NDPK5 0.00000
4 SHK3Dr 0.33424
... ... ...
2707 MPTS 0.00000
2708 MOCOS 0.00000
2709 BMOGDS2 0.00000
2710 FESD2s 0.00000
2711 OCTNLL 0.00000
2712 rows × 2 columns
Individual reaction flux values can be obtained from the dictionary representation. For example, the Prephenate dehydratase reaction flux can be obtained from the previous pFBA simulation using the reaction identifier:
result.fluxes['PPNDH']
0.16247688893081344
Retrieving and setting the model objective¶
The simulation objective, when running FBA or pFBA phenotype simulations, is, by default, the model objective which can be seen using the simulator.
simul.objective
{'BIOMASS_Ec_iML1515_core_75p37M': 1.0}
The simulator may also be used to change the model objective, for example, to optimize the ATP maintenance requirement (ATPM):
simul.objective = 'ATPM'
# or
simul.objective = {'ATPM':1}
The last enables to define objectives as linear expressions where the dictionary values are the expression coefficients.
Adding additional constraints to phenotype simulations¶
Simulations may include additional metabolic constraints on reaction fluxes. From the previous pFBA simulation one can observe that the organism does not produce L-tyrosine:
simul.objective = 'BIOMASS_Ec_iML1515_core_75p37M'
result.fluxes['EX_tyr__L_e']
0.0
Additional constraints may be added to the model so that the organism start to produce this aromatic amino acid. We may alter, for example, the 3-dehydroquinate dehydratase reaction bounds, among others, starting by verifying its initial bounds:
# initial bounds
simul.get_reaction_bounds('DHQTi')
(0.0, 1000.0)
# additional constraint
constraints = {'DHQTi' : (1, 10000),
'PPC' : (16.791619483918264, 10000),
'FADRx2': (0, 0.0),
'PDH' : (0, 0)}
# run a pFBA simulation accounting with the new constraint
result = simul.simulate(method='pFBA',constraints=constraints)
result.fluxes['EX_tyr__L_e']
0.7362937237983241
We also need to verify that the organism continues to grow:
res.fluxes['BIOMASS_Ec_iML1515_core_75p37M']
0.691926343744802
It is also possible to plot the production envelope:
from mewpy.visualization.envelope import plot_flux_envelope
plot_flux_envelope(simul,'BIOMASS_Ec_iML1515_core_75p37M','EX_tyr__L_e',constraints = constraints)
The simulate
method includes additional parameters, such as the optimization direction. For a full description please refer to the module documentation.
Flux Variability Analysis¶
The simulator interface also allows to perform Flux Variability Analysis (FVA) for L-tyrosine, returning a dictionary:
simul.FVA(reactions=['EX_tyr__L_e'])
{'EX_tyr__L_e': [0.0, 0.5721226993865252]}
or a data frame:
simul.FVA(reactions=['EX_tyr__L_e'], format='df')
Reaction ID Minimum Maximum
0 EX_tyr__L_e 0.0 0.572123
By default, MEWpy sets the model objective fraction to 90%, however this fraction may be altered. For example, one might want to consider a fraction of 10% from optimal growth:
simul.FVA(reactions=['EX_tyr__L_e'], obj_frac=0.1)
{'EX_tyr__L_e': [0.0, 5.086103322262054]}
The FVA simulations are run considering the defined environmental conditions. Additional constraints may be added, or changed, such as the ones previously used.
simul.FVA(reactions=['EX_tyr__L_e'], constraints=constraints)
{'EX_tyr__L_e': [0.0, 1.1883110320942545]}
COBRApy users may have noticed that this same task would have required many additional coding lines if using the COBRApy API directly.
Genes and reactions essentiality¶
Gene and reaction essentiality tests identify , respectively, the list of genes and reactions whose deletion would prevent the organism to grow.
simul.essential_reactions()
['SHK3Dr',
'DHORTS',
'OMPDC',
'G5SD',
'CS',
'ICDHyr',
'PPA',
'APRAUR',
'DB4PS',
'ALAR',
'RBFK',
'ASPTA',
...]
simul.essential_genes()
['b3368',
'b1260',
'b0109',
'b1094',
'b2472',
'b1261',
'b3770',
'b0243',
'b0414',
'b3041',
'b0025',
'b1662',
'b0421',
...]
Optimization Tasks¶
MEWpy is a Computational Optimization framework that uses Evolutionary Computation to solve Metabolic Engineering and Systems Biology problems. For example, MEWPY allows to find sets of metabolic modifications that favor a defined phenotypic objective, such as the production of targueted metabolites, optimize the composition of microbial communities, medium composition, or enzymes kinetic rates. Optimization tasks are defined as optimization problems which encompass a metabolic model and optimization objectives. Additional arguments may be added depending on the problem specificities.
problem = Problem(model,[f_obj_1, f_obj_2, f_obj_3],**kwargs)
Optimization Engines¶
A problem may contemplate one or more optimization objectives. The number of objectives will determine the Evolutionary Algorithms best suited to solve problem. Single objective problems may be addressed using a Genetic Algorithm (GA) or Simulated Annealing (SA), problems with two objectives may be solved using the Non-sorting Genetic Algorithm (NSGA-II) or the Strength Pareto Evolutionary Algorithm (SPEA2), while many objective problems, with three or more objectives, require the use of algorithms such as NSGA-III or the Hypervolume Estimation algorithm (HypE).
Invoking an EA to solve a defined optimization problem is a straight forward task as depicted in the following example:
from mewpy.optimization import EA
ea = EA(problem, max_generations=100, algorithm='NSGAIII')
ea.run()
Presently, MEWpy uses optimization engines provided by Inspyred and JMetalPy packages. The last is preferred, not only because it is actively maintained, but also because it provides a broader set of EAs and tools.
Strain Optimization Strategies and Methods¶
MEWpy supports different models, some integrating omic data, such as enzymatic expression (GECKO and sMOMENT models) others integrating regulatory networks. The availability of such models enables MEWpy to suggest modifications in distinct layers, targeting modifications on reaction fluxes, gene expression, enzyme availability, regulatory genes and transcription factors. Modifications follow two main strategies, notably, deletions and over/under expression of targets.
Reactions¶
Deletion on reactions fluxes are achieved by setting the reactions lower and upper bounds to 0.
Under expressing a forward reaction consists on setting its flux upper bound to a value lesser to a reference flux value, while under expressing a backward reaction flux consists on setting its lower bound to a value greater to the reference flux value. For reversible reactions, the reference flux direction sets the direction to be preserved, and the lower bound or upper bound is set to 0 if and respectively the reference flux direction is forward or backward. Under expressing a reaction with no flux on the reference is equivalent to a deletion.
Over expressing a forward reaction consists on setting its flux lower bound to a value greater to the reference flux value, while over expressing a backward reaction flux consists on setting its upper bound to a value lesser to the reference flux value. Similarly to the under expression of reactions strategy, the reference flux direction sets the direction to be preserved.
Reference flux values may be defined by a user or automatically computed by MEWpy. The last considers two strategies: 1) use the wild type flux distribution as reference or 2) perform a two step simulation process where deletions are firstly applied to derive a reference flux distribution which will be used to set the over/under regulations on a second step.
To configure a problem with a customized reference distribution, just pass the reference, a dictionary of reaction: value, as a problem parameter:
problem = Problem(model, objectives, reference=reference)
To use the wild type flux distribution as reference, computed for the provided medium, set the twostep
parameter to False
:
problem = Problem(model, objectives, envcond=medium, twostep=False)
The two step approach is applied by default by MEWpy.
Genes¶
A combinatorial solution that over- or under expresses a set of genes is propagated to the catalyzed reactions converting transcription/translational information into constraints over the fluxes of the reactions (see Fig. 2). To that end, and taking as reference the flux distribution resulting from applying the deletions contained in the genetic modifications, the Boolean operators OR/AND in Gene-Protein-Reaction (GPR) rules are translated into functional operators (by defaults MAX and MIN functions) asserting new reaction flux constraints.
Fig.2: Propagation of gene expression values to constraints over reaction fluxes.
As an example, lets suppose a reaction R has a GPR rule ‘(G1 or G2) and (G3)’ associated to it.
gpr = '(G1 or G2) and G3'
MEWpy converts the GPR rule into a parsing tree (using a Boolean syntax) that is later used for evaluation.
from mewpy.util.parsing import build_tree, Boolean
tree = build_tree(gpr,Boolean)
The parsing can be visualized as prefix representation of the rule or as a tree:
print(tree)
& ( | ( G1 , G2 ) , G3 )
tree.print_node()
|____[&]
|____[|]
|____ G1
|____ G2
|____ G3
To evaluate the deletion of a gene, for example G1, we define the list of ON genes, that is, G2 and G3, and instantiate a BooleanEvaluator
.
from mewpy.util.parsing import BooleanEvaluator
be = BooleanEvaluator(['G2','G3'])
tree.evaluate(be.f_operand, be.f_operator)
True
That is, there is no impact on the reaction if gene G1 is deleted.
Now lets suppose that the only OFF gene is G3:
be = BooleanEvaluator(['G1','G2'])
tree.evaluate(be.f_operand, be.f_operator)
False
As expected, when deleting G3, the reaction R is knockout.
Over or under expressing genes fallows a similar process, but the Boolean operators are replaced with functions. To that end, we use a GeneEvaluator, identifying the functions to replace the [AND,OR] operators, as well as the genes expression:
from mewpy.util.parsing import GeneEvaluator
# G1 and G3 are under expressed to half, and a quarter.
# G2 is over expressed to twice its expression value.
genes = {'G1':0.5,'G2':2,'G3':0.25}
# min replaces AND
# max replaces OR
evaluator = GeneEvaluator(genes, min,max)
Now, let us evaluate the GPR rule for reaction R using the newly define evaluator:
tree.evaluate(be.f_operand, be.f_operator)
0.25
That is, the reaction flux will be under expressed to 0.25 of the reference.
Enzymes¶
Genome-Scale Metabolic Models, although being efficient in modeling an organism’s gene-protein-reaction (GPR) interactions, are oblivious to other equally important factors (e.g. enzyme kinetics and abundance, Transcription Factors, signaling, etc.) which affect cells’ metabolism and transcriptional regulation. The incorporation of such elements as additional constraints leads to better and more accurate phenotype prediction. GECKO (GSMM with enzymatic constraints using kinetic and omics data) and sMOMENT (short MetabOlic Modelling with ENzyme kineTics) models enhance GSMMs by imposing soft constraints as upper bounds on fluxes representing the overall enzyme usage. GEMs stoichiometric matrices are extended by adding new rows that represent the enzymes and new columns that represent each enzyme’s usage. Kinetic information is included, in each added row, as stoichiometric coefficients in the form of enzymes’ turnover (kcat) values. Also, an identity sub-matrix over the added rows and columns permits to model each enzyme usage upper bound, respecting therefore the constraints on each flux.
The computational strain optimization method performs modifications to the organism’s metabolism by over- and under expressing protein values (see Fig. 3), which are reflected in the catalyzed reaction bounds. The optimization procedure, defined as in the previous sections, delivers solutions that reflect the over- or under expression of proteins whose usage is limited by enzyme upper bound constraints and by the protein pool inside the cell.
Fig.3 :Protein over- and under expression.
Kinetic models of metabolism may also be explored to engineer mutant strains. Indeed, modifications on maximum velocity parameters values can be explored as proxies for enzyme concentrations. MEWpy, makes availabe a set of kinetic optimization problems that enables the exploration of such strategies.
Regulatory Networks¶
A metabolic model alone has a significant limitation in revealing condition-specific metabolic activity because gene regulation plays an important role in constraining the particular metabolism available under any given condition. Also, the complex cross-talking mechanisms between gene regulation and metabolism are not captured by a metabolic model alone. As such, methods that systematically integrate a transcriptional regulatory network and a metabolic network have been put forward, including Regulatory Flux Balance Analysis (RFBA), the Steady-state Regulatory FBA (SRFBA), Probabilistic Regulation of Metabolism (PROM), Integrated Deduced REgulation And Metabolism (IDREAM), and CoRegFlux.
Computational strain optimization has benefited from the integration of regulatory and metabolic networks, proposing more effective metabolic engineering strategies. OptORF is one of the earliest strategies which aims to identify the optimal metabolic and regulatory gene deletions as well as gene over-expression that maximize biochemical production at the maximum cellular growth under transcriptional regulatory constraints. OptORF is a bi-level optimization problem. While the inner problem maximizes growth under the given gene deletions, the regulatory states are determined by the constraints of the outer problem. One limitation of OptORF is that it is based on a mechanical (Boolean) representation of the regulatory network and therefore ignores the range of possible regulatory intensities between regulatory factors and the target genes. Also, Boolean networks can only suggest the manipulation of transcription factors by knockouts (ON to OFF) and cannot guide more quantitative adjustment of transcriptional regulation. Nonetheless, this strategy was included in MEWpy to enable a broader set of possible solutions. Regulatory rules are often complex and may include conditions. Next, we present an example of how MEWpy evaluates such a rule:
# boolean example with conditions.
expression = "( (Lrp AND NOT (leu_L_e_>0)) OR NOT(((GlnG AND GlnB AND GlnD) AND RpoN) \
AND ((glu_L_e_>0) OR (arg_L_e_>0) OR (asp_L_e_>0) OR (his_L_e_>0) OR (pro_L_e_>0) )))"
t = build_tree(expression, Boolean)
true_list = ['GlnG']
# dict of variables values
values = {'leu_L_e_': 1, 'glu_L_e_': 7, "arg_L_e_": 0.5, "asp_L_e_": 2, "his_L_e_": 0, "pro_L_e_": 0}
# propositions in the list are evaluated as True and the remaining as False.
evaluator = BooleanEvaluator(true_list, values)
res = t.evaluate(evaluator.f_operand, evaluator.f_operator)
True
More recently, OptRAM (Optimization of Regulatory And Metabolic Networks) was developed as a novel strain design algorithm that can identify combinatorial optimization strategies including over-expression, knockdown or knockout of both metabolic genes and transcription factors. OptRAM is based on IDREAM integrated network framework, which makes it able to deduce a regulatory network from data. While the original OptRAM uses a single objective Simulated Annealing algorithm to explore the solution search space, the MEWpy implementation uses MOEAs which improves on the solution’s diversity and convergence.
Evaluation Functions¶
All evaluation functions/optimization objectives are defined in mewpy.optimization.evaluation
module.
Biomass-Product Coupled Yield (BPCY):
The maximization of the Biomass-Product Coupled Yield is one of the most commonly used objectives in Computational Strain Optimization.
from mewpy.optimization.evaluation import BPCY
fevaluation = BPCY(biomass_reaction_id,product_reaction_id)
By default, MEWPy computes reaction yields using pFBA. This may thought be altered by defining an alternative phenotype simulation method, such as lMOMA.
fevaluation = BPCY(biomass_reaction_id,product_reaction_id,method ='lMOMA')
Also, the BPCY computation may account for a carbon source or substrate consumption:
from mewpy.optimization.evaluation import BPCY
fevaluation = BPCY(<biomass_reaction_id>,<product_reaction_id>,\
uptake=<substrate_reaction_id>)
Weighted Yield (WYIELD):
BPCY, on its own, has some limitations. Although the BPCY score of a mutated solution may be high, the flux value of the target reaction may be unstable with the max biomass. To guide the EA to more robust solutions, MEWpy also includes a weight yield objective, that encompasses the target product flux variability, constrained to a minimal growth and introduced metabolic modifications.
from mewpy.optimization.evaluation import WYIELD
fevaluation = WYIELD(<biomass_reaction_id>,<product_reaction_id>)
The trade-off parameter is by default set to 0.3. However it may be altered by adding a new parameter when instantiating the class, for example, alpha=0.5
.
fevaluation = WYIELD(<biomass_reaction_id>,<product_reaction_id>,alpha=0.5)
The minimum growth yield may be explicitly defined, min_biomass_value=<some_value>
, or as a percentage of the wild type biomass, min_biomass_per=0.1
, that is 10%.
BPCY with FVA
MEWpy also includes an objective function that combines BPCY and WYIELD, whose formulation is:
from mewpy.optimization.evaluation import BPCY_FVA
fevaluation = BPCY_FVA(<biomass_reaction_id>,<product_reaction_id>,uptake=<substrate_reaction_id>)
As in BPCY, the substrate is optional and fluxes may be obtained using different phenotype simulation methods.
This objective function is based on a proposal from “OptRAM: In-silico strain design via integrative regulatory-metabolic network modeling” where the additional factor to BPCY favors solutions with a smaller gap between the product minimum an maximum FVA.
Product Yield
The most straight forward objective function is the product yield, where the goal is to maximize the production of a target product. In the MEWpy implementation, this objective function may consider a minimum biomass production explicitly defined or a percentage of the wild type growth. This objective function also may consider distinct phenotype simulation methods. Also, this same objective function may be used for the minimization of a targeted reaction flux by changing the optimization sense, that is, by setting the argument maximize=False
.
from mewpy.optimization.evaluation import TargetFlux
fevaluation = TargetFlux(<product_reaction_id>)
Minimum number of modifications
Although problems definition allows for setting a maximum number of modifications, and solutions from the final population may be automatically simplified to remove unnecessary modifications, the minimization of the number of perturbations can be set as an additional optimization objective.
from mewpy.optimization.evaluation import MinCandSize
fevaluation = MinCandSize()
Modification type
This objective favors modifications with deletions and down regulations. As such, a solution that encompasses more deletions is considered better than one with many up regulations.
from mewpy.optimization.evaluation import ModificationType
fevaluation = ModificationType()
Molecular weight
Minimizes the sum of molecular weights of the products/substrates of a set of reactions (g/gDW/h).
from mewpy.optimization.evaluation import MolecularWeight
fevaluation = MolecularWeight([r_id_1,r_id_2,...])
Combining two or more objectives
The previously defined objective functions may be combined into a linear aggregated weighed sum and used in single objective optimization algorithms, such as Genetic Algorithm or Simulated Annealing.
Though the sum of all weights should be equal to 1, this is not imposed as weights may also be used to introduce a normalization for each function. When not provided, the aggregated function assigns a same weight to all functions w=1/n.
from mewpy.optimization.evaluation import BPCY, WYIELD, AggregatedSum
f1 = BPCY(biomass_reaction_id,product_reaction_id,method ='lMOMA')
f2 = WYIELD(<biomass_reaction_id>,<product_reaction_id>)
fevaluation = AggregatedSum([f1,f2],tradeoffs=[0.7,0.3])
Optimization Problems¶
Reaction Constraint Problems¶
# load the model
from reframed.io.sbml import load_cbmodel
model = load_cbmodel('iJO1366SL.xml', flavor='cobra')
# Define the target
PRODUCT_ID = 'R_EX_tyr_DASH_L_LPAREN_e_RPAREN_'
BIOMASS_ID = 'R_Ec_biomass_iJO1366_core_53p95M'
# environmental conditions
envcond = {'R_EX_o2_LPAREN_e_RPAREN_' : (-9.66, 100000.0),
'R_EX_glc_LPAREN_e_RPAREN_' : (-12.5,100000.0)}
}
# Optimization objectives
from mewpy.optimization.evaluation import BPCY, WYIELD
evaluator_1 = BPCY(BIOMASS_ID, PRODUCT_ID, method='lMOMA')
evaluator_2 = WYIELD(BIOMASS_ID, PRODUCT_ID)
# build a new reaction deletion problem instance
from mewpy.problems import RKOProblem
problem = RKOProblem(model,
fevaluation=[evaluator_1, evaluator_2],
envcond=envcond)
# build a new reaction over/under expression problem instance
from mewpy.problems import ROUProblem
problem = ROUProblem(model,
fevaluation=[evaluator_1, evaluator_2],
envcond=envcond)
# run the optimization
from mewpy.optimization import EA
ea = EA(problem, max_generations= 100, visualizer=True)
final_pop = ea.run()
When the optimization is concluded, the final population is simplified by removing genetic modifications that do not impair the objectives. This step, which also includes the filtering of duplicated solutions, may be skipped:
final_pop = ea.run(simplify=False)
Gene Constraint Problems¶
Optimizations of genes’ expression are run by setting and running the intended problem. Gene deletion optimization problems are defined as a GKOProblem while gene over- or under expression optimization use the GOUProblem class.
# build a new problem instance
from mewpy.problems import GKOProblem
problem = GKOProblem(model, fevaluation=[
evaluator_1, evaluator_2], envcond=envcond)
# build a new problem instance
from mewpy.problems import GOUProblem
problem = GOUProblem(model, fevaluation=[
evaluator_1, evaluator_2], envcond=envcond)
Enzymatic Constraints Problems¶
MEWpy enables strain optimization using Genome-scale models enhanced with enzymatic (kcat) parameters and enzyme mass constraints:
- MEWpy supports GECKO models, from the original COBRApy based implementation, but also implemented over REFRAMED package.
- MEWpy also supports sMOMENT and GECKO like models obtained from AutoPACMEN .
The optimization API is common to the one defined for the optimization of metabolic constraints (Reactions and Genes). MEWpy automatically selects the phenotype simulator for the loaded model.
# load the model
from mewpy.model.gecko import GeckoModel
model = GeckoModel('single-pool')
# Define the target
PRODUCT_ID = 'r_1913'
BIOMASS_ID = 'r_2111'
# environmental conditions
envcond = {'r_1714_REV' : (-12.5,100000.0)}
# Optimization objectives
from mewpy.optimization.evaluation import BPCY, WYIELD
evaluator_1 = BPCY(BIOMASS_ID, PRODUCT_ID, method="lMOMA")
evaluator_2 = WYIELD(BIOMASS_ID, PRODUCT_ID)
# build a new problem instance for enzymatic OU
from mewpy.problems import GeckoOUProblem
problem = GeckoOUProblem(model, fevaluation=[
evaluator_1, evaluator_2], envcond=envcond)
# run the optimization
from mewpy.optimization import EA
ea = EA(problem, max_generations= 100, visualizer=True)
final_pop = ea.run()
Regulatory Constraints¶
MEWpy implements computational strain design optimization with regulatory constraints. Presently two methods are available, OptRAM and OptORF.
OptRAM Example¶
from mewpy.problems.optram import OptRamProblem, load_optram
# regulatory matrix Genes x TFs
matrix_file = 'regnet.csv'
# csv file mapping genes names entries in the regulatory matrix
gene_file = 'mgene.csv'
# csv with TFs expression
tf_file ='TFnames.csv'
BIOMASS_ID = 'r_2111'
PRODUCT_ID = 'r_1913' #TYR
GLC = 'r_1714'
# build the regulatory network
# add the prefix 'G_' to genes. Only for REFRAMED models
regnet = load_optram(gene_file, tf_file, matrix_file, gene_prefix='')
# load the model
from cobra.io import read_sbml_model
model = read_sbml_model('yeast_7.6-optram.xml')
# define the optimization objectives
from mewpy.optimization.evaluation import BPCY, WYIELD
evaluator_1 = BPCY(BIOMASS_ID, PRODUCT_ID, method="lMOMA")
evaluator_2 = WYIELD(BIOMASS_ID, PRODUCT_ID)
# environmental conditions
envcond = {GLC:(-12.5,10000)}
# instantiate the problem
problem = OptRamProblem(model, [evaluator_1, evaluator_2],
regnet, candidate_min_size=1, candidate_max_size=6, envcond = envcond)
from mewpy.optimization import EA
ea = EA(problem, max_generations=100, mp=True)
final_pop = ea.run()
OptORF Example¶
from mewpy.io import Reader, Engines, read_model
from mewpy.optimization import EA, BPCY, WYIELD
from mewpy.problems import OptORFProblem
# load a GERM model. Consult the documentation (mewpy.germ) for more details
metabolic_reader = Reader(Engines.MetabolicSBML, 'iJR904.xml')
regulatory_reader = Reader(Engines.BooleanRegulatoryCSV, 'iMC1010.csv',
sep=',', id_col=0, rule_col=4, aliases_cols=[1, 2, 3], header=0)
model = read_model(metabolic_reader, regulatory_reader)
BIOMASS_ID = 'BiomassEcoli'
GLC = 'EX_glc_DASH_D_e'
PRODUCT_ID = 'EX_succ_e'
# OptORF can be used with an initial state of the regulatory network.
initial_state = {
'Stringent': 0.0,
'high-NAD': 0.0,
'AGDC': 0.0,
}
model.objective = {BIOMASS_ID: 1}
model.get(GLC).bounds = (-18.5, 0.0)
model.get(BIOMASS_ID).lower_bound = 0.1
evaluator_1 = BPCY(BIOMASS_ID, PRODUCT_ID)
evaluator_2 = WYIELD(BIOMASS_ID, PRODUCT_ID)
problem = OptORFProblem(model, [evaluator_1, evaluator_2], initial_state=initial_state, candidate_max_size=6)
ea = EA(problem, max_generations=10, mp=True)
final_pop = ea.run()
GERM models and analysis¶
MEWpy supports the integration of regulatory and metabolic models at the genome-scale.
All tools required to build, simulate, and analyze GEnome-scale Regulatory and Metabolic (GERM) models
are available in the mewpy.germ
module.
A GERM model includes a standard Genome-Scale Metabolic (GEM) model. The GEM model comprehends reactions (w/ GPRs), metabolites and genes. It also includes exchange reactions defining the environmental conditions of the system. In addition, GERM models include Transcriptional Regulatory Networks (TRNs). A TRN comprehends interactions (w/ boolean algebra expressions), target genes and regulators. It also includes external stimuli (effectors), regulatory metabolites, and regulatory reactions. GEM models and TRNs are often linked by the target genes of the TRN and the genes of the GEM model.
MEWpy supports several methods to perform phenotype simulations using integrated GERM models.
The following simulation methods are available in mewpy.germ.analysis
module:
FBA
- Metabolic model onlypFBA
- Metabolic model onlyRFBA
- Regulatory-Metabolic modelSRFBA
- Regulatory-Metabolic modelPROM
- Regulatory-Metabolic modelCoRegFlux
- Regulatory-Metabolic model
Reading GERM models¶
In this example, we will be using the integrated E. coli core model published by Orth et al, 2010.
E. coli integrated model is available in two separate files:
- metabolic model examples/models/germ/e_coli_core.xml
- regulatory model examples/models/germ/e_coli_core_trn.csv
To assemble an integrated GERM model, we use mewpy.io.read_model
function.
This function accepts multiple readers having different engines.
MEWpy contains the following engines that can be used in the Reader
object:
BooleanRegulatoryCSV
- reads a TRN from a CSV file - regulatory interactions:target, (regulator1 and regulator2)
CoExpressionRegulatoryCSV
- reads a TRN from a CSV file - regulatory interactions:target, co-activator1 co-activator2, co-repressor2
TargetRegulatorRegulatoryCSV
- reads a TRN from a CSV file - regulatory interactions:target, regulator1
RegulatorySBML
- reads a TRN from a SBML file using the SBML-QUAL pluginMetabolicSBML
- reads a GEM from a SBML fileCobraModel
- reads a GEM from a COBRApy modelReframedModel
- reads a GEM from a Reframed modelJSON
- reads a GERM model from a JSON file
In addition, the Reader
accepts other arguments such as the filename
, sep
, among others.
from mewpy.io import Reader, Engines, read_model
# a reader for the E. coli core GEM model
gem_reader = Reader(Engines.MetabolicSBML, 'e_coli_core.xml')
# a reader for the E. coli core TRN model
# (it accepts specific parameters for reading the TRN CSV file)
trn_reader = Reader(Engines.BooleanRegulatoryCSV, 'e_coli_core_trn.csv',
sep=',', id_col=0, rule_col=2, aliases_cols=[1], header=0)
# reading the integrated regulatory-metabolic model
model = read_model(gem_reader, trn_reader)
model
Model | e_coli_core |
---|---|
Name | E. coli core model - Orth et al 2010 |
Types | regulatory, metabolic |
Compartments | e, c |
Reactions | 95 |
Metabolites | 72 |
Genes | 137 |
Exchanges | 20 |
Demands | 0 |
Sinks | 0 |
Objective | Biomass_Ecoli_core |
Regulatory interactions | 159 |
Targets | 159 |
Regulators | 45 |
Regulatory reactions | 12 |
Regulatory metabolites | 11 |
Environmental stimuli | 0 |
Although mewpy.io.read_model
function is the preferred interface for reading models, MEWpy contains other read/write methods available at mewpy.io
.
Working with GERM models¶
A GERM model contains the following metabolic information:
objective
- variable/coefficient dictionaryreactions
- identifier/reaction dictionarymetabolites
- identifier/metabolite dictionarygenes
- identifier/gene dictionarygprs
- identifier/GPR expression dictionarycompartments
- identifier/compartment dictionaryexchanges
- identifier/reaction dictionarydemands
- identifier/reaction dictionarysinks
- identifier/reaction dictionaryexternal_compartment
- Compartment with most exchange reactions
A GERM model contains the following regulatory information:
interactions
- identifier/interaction dictionarytargets
- identifier/target dictionaryregulators
- identifier/regulator dictionaryregulatory_reactions
- identifier/reaction dictionaryregulatory_metabolites
- identifier/metabolite dictionaryenvironmental_stimuli
- identifier/regulator dictionary
One can inspect model attributes in Jupyter notebooks:
# read E. coli core model
from mewpy.io import Reader, Engines, read_model
gem_reader = Reader(Engines.MetabolicSBML, 'e_coli_core.xml')
trn_reader = Reader(Engines.BooleanRegulatoryCSV, 'e_coli_core_trn.csv',
sep=',', id_col=0, rule_col=2, aliases_cols=[1], header=0)
model = read_model(gem_reader, trn_reader)
model.objective
{Biomass_Ecoli_core || 1.496 3pg_c + ...
model.reactions
'ACALDt': ACALDt || 1.0 acald_e <-> 1.0 acald_c,
'ACKr': ACKr || 1.0 ac_c + 1.0 atp_c <-> 1.0 actp_c + 1.0 adp_c,
'ACONTa': ACONTa || 1.0 cit_c <-> 1.0 acon_C_c + 1.0 h2o_c,
'ACONTb': ACONTb || 1.0 acon_C_c + 1.0 h2o_c <-> 1.0 icit_c,
'ACt2r': ACt2r || 1.0 ac_e + 1.0 h_e <-> 1.0 ac_c + 1.0 h_c,
'ADK1': ADK1 || 1.0 amp_c + 1.0 atp_c <-> 2.0 adp_c,
'AKGDH': AKGDH || 1.0 akg_c + 1.0 coa_c + 1.0 nad_c -> 1.0 co2_c + 1.0 nadh_c + 1.0 succoa_c,
...
model.interactions
'b0080_interaction': b0080 || 1 = ( ~ surplusFDP),
'b0113_interaction': b0113 || 1 = ( ~ surplusPYR),
'b0114_interaction': b0114 || 1 = (( ~ b0113) | b3261),
'b0115_interaction': b0115 || 1 = (( ~ b0113) | b3261),
...
A GERM model container is a regular Python dictionary. They can be used to access variables in the model (e.g.,
model.reactions['MY_REACTION']
).
One can also yield variables from the model using yield_...
-like methods, such as model.yield_regulators()
.
# get PDH reaction from the model
pdh = model.reactions['PDH']
pdh
Identifier | PDH |
---|---|
Name | |
Aliases | |
Model | e_coli_core |
Types | reaction |
Equation | 1.0 coa_c + 1.0 nad_c + 1.0 pyr_c -> 1.0 accoa_c + 1.0 co2_c + 1.0 nadh_c |
Bounds | (0.0, 1000.0) |
Reversibility | False |
Metabolites | coa_c, nad_c, pyr_c, accoa_c, co2_c, nadh_c |
Boundary | False |
GPR | (b0115 & b0114 & b0116) |
Genes | b0115, b0114, b0116 |
Compartments | c |
Charge balance | {'reactants': 6.0, 'products': -6.0} |
Mass balance | {'C': 0.0, 'H': 0.0, 'N': 0.0, 'O': 0.0, 'P': 0.0, 'S': 0.0} |
# iterate over regulators
for regulator in model.yield_regulators():
print(regulator)
break
b0113 || (0.0, 1.0)
NOTE
It is possible to add variables to the model dictionaries (e.g., model.reactions['MY_REACTION'] = my_reaction
),
but this is not recommended. The model dictionaries are used to keep track of the model variables
and should not be modified directly.
To add or remove variables, use the model.add()
and model.remove()
methods documented bellow. These methods
will perform the required updates in the model.
GERM model operations¶
A GERM model supports the following operations:
get(identifier, default=None)
- It retrieves the variable using its identifieradd(variables)
- It adds new variables to the model; variables will be added to model containers according to their typesremove(variables)
- It removes variables from the model; variables will be removed from model containers according to their typesupdate(variables, objective, ...)
- It updates variables, compartments, objective, etc, in the modelcopy()
- It makes a shallow copy of the modeldeepcopy()
- It makes a deep copy of the modelto_dict()
- It exports the model to a dictionary
# get the Crp regulator
crp = model.get('b3357')
crp
Identifier | b3357 |
---|---|
Name | b3357 |
Aliases | b3357, Crp |
Model | e_coli_core |
Types | regulator, target |
Coefficients | (0.0, 1.0) |
Active | True |
Interactions | b0721_interaction, b0722_interaction, b0723_interaction, b0724_interaction, b0902_interaction, b0903_interaction, b0904_interaction, b1524_interaction, b2492_interaction, b3114_interaction, b3115_interaction, b3870_interaction, b4122_interaction |
Targets | b0721, b0722, b0723, b0724, b0902, b0903, b0904, b1524, b2492, b3114, b3115, b3870, b4122 |
Environmental stimulus | False |
Interaction | b3357 || 1 = CRPnoGLC |
Regulators | CRPnoGLC |
# remove the regulator from the model
model.remove(crp)
'b3357' in model.regulators
False
# add the regulator back to the model
model.add(crp)
'b3357' in model.regulators
True
# shallow copy only performs a copy of the containers
model_copy = model.copy()
print(model is model_copy)
# but variables are still the same
crp is model_copy.regulators['b3357']
# deep copy performs a copy of the containers and variables
model_copy = model.deepcopy()
crp is model_copy.regulators['b3357']
False
True
False
# export the model to a dictionary
model.to_dict()
A GERM model supports temporary changes using the with model
context manager.
In addition, one can manually undo()
and redo()
the last operations or reset()
and restore()
a GERM model.
# make a temporary change to the model
pfk = model.get('PFK')
with model:
model.remove(pfk)
print('Is PFK in the model?', 'PFK' in model.reactions)
print('Has PFK removal been reverted?', 'PFK' in model.reactions)
Is PFK in the model? False
Has PFK removal been reverted? True
A GERM model is by default a multi-type model. This means that one can manipulate both a metabolic and regulatory model at the same time. Alternatively, one can manipulate a single regulatory or metabolic model.
MEWpy allows building single- or multi-type models easily:
from mewpy.germ.models import RegulatoryModel
# creating a new regulatory model
reg_model = RegulatoryModel(identifier='my_regulatory_model')
reg_model
Model | my_regulatory_model |
---|---|
Name | my_regulatory_model |
Types | regulatory |
Compartments | |
Regulatory interactions | 0 |
Targets | 0 |
Regulators | 0 |
Regulatory reactions | 0 |
Regulatory metabolites | 0 |
Environmental stimuli | 0 |
# check if the model is metabolic
reg_model.is_metabolic()
False
from mewpy.germ.models import Model
pfk = model.get('PFK').deepcopy()
# using types model constructor
met_model = Model.from_types(('metabolic', ),
identifier='my_metabolic_model',
reactions={'pfk': pfk})
met_model
Model | my_metabolic_model |
---|---|
Name | my_metabolic_model |
Types | metabolic |
Compartments | |
Reactions | 1 |
Metabolites | 5 |
Genes | 2 |
Exchanges | 0 |
Demands | 0 |
Sinks | 0 |
Objective | None |
Working with GERM model variables¶
MEWpy includes the following metabolic and regulatory variables:
Reaction
- Object to represent metabolic reactions having bounds, stoichiometry (metabolite/coefficient) and GPRsMetabolite
- Object to represent metabolic compounds having charge, compartment, formula and reactionsGene
- Object to represent metabolic genes having coefficients and reactions (found in GPR expressions)Interaction
- Object to represent regulatory interactions having a target and associated regulatory events (coefficient/boolean rule)Target
- Object to represent regulatory targets having coefficients and interactionRegulator
- Object to represent regulatory having coefficients and interactions
Variables have different attributes that can be inspected and changed. Variables are often connected to other variables and have special attributes, such as boolean expressions, coefficients and dictionaries of metabolites (stoichiometry).
All GERM model variables support:
copy()
- It makes a shallow copy of the modeldeepcopy()
- It makes a deep copy of the model- Temporary changes using
with
,undo()
,redo()
,reset()
,restore()
, - yield linked variables, such as
yield_metabolites()
Reactions, Metabolites and Genes¶
Reactions have the following attributes:
- identifier - id of the variable
- name - name of the variable
- aliases - aliases of the variable
- bounds - reaction bounds; it must be a tuple with both values; (-1000, 1000) by default
- lower_bound - reaction lower bound
- upper_bound - reaction upper bound
- reversibility - whether the reaction is reversible
- stoichiometry - reaction stoichiometry; a dictionary of metabolite variable-coefficient
- gpr - a symbolic expression containing the boolean logic of the gene variables; AND (symbolic &); OR (symbolic |)
- gene_protein_reaction_rule - symbolic representation of the GPR expression
- metabolites - reaction metabolites; a dictionary of metabolite identifier-metabolite variable
- reactants - reaction reactants; a dictionary of metabolite identifier-metabolite variable
- products - reaction products; a dictionary of metabolite identifier-metabolite variable
- compartments - all compartments associated with the reaction metabolites
- boundary - whether the reaction is exchange, demand or sink
- equation - notation with reactants, products and reversibility
- charge_balance - charge balance of the reaction
- mass_balance - mass balance of the reaction
and the following methods:
ko()
- reaction deletion; it sets the bounds to zeroadd_metabolites(stoichiometry)
- add metabolites to the reactionremove_metabolites(metabolite)
- remove metabolites from the reactionadd_gpr(gpr)
- add/replacing gpr to the reactionremove_gpr()
- remove gpr from the reaction
Metabolites have the following attributes:
- identifier - id of the variable
- name - name of the variable
- aliases - aliases of the variable
- charge - metabolite charge
- compartment - metabolite compartment
- formula - metabolite chemical formula
- atoms - frequency of each atom in the chemical formula
- molecular_weight - metabolite molecular weight
- exchange_reaction - the first exchange reaction associated with the metabolite
- exchange_reactions - the list of all exchange reactions associated with the metabolite
- reactions - the reactions associated with this metabolite; a dictionary of reaction identifier-reaction variable
Genes have the following attributes:
- identifier - id of the variable
- name - name of the variable
- aliases - aliases of the variable
- coefficients - the gene coefficients; all possible values that a gene can take during GPR evaluation; (0, 1) by default
- is_active - whether the maximum coefficient is bigger than zero
- reactions - the reactions associated with this gene; a dictionary of reaction identifier-reaction variable
and the following methods:
ko()
- gene deletion; it sets the coefficients to zero
Bold-italicized properties can be set with new values (e.g., reaction.bounds = (0, 1000)
).
# read E. coli core model
from mewpy.io import Reader, Engines, read_model
gem_reader = Reader(Engines.MetabolicSBML, 'e_coli_core.xml')
trn_reader = Reader(Engines.BooleanRegulatoryCSV, 'e_coli_core_trn.csv',
sep=',', id_col=0, rule_col=2, aliases_cols=[1], header=0)
model = read_model(gem_reader, trn_reader)
# inspecting a reaction
ack = model.get('ACKr')
ack
Identifier | ACKr |
---|---|
Name | |
Aliases | |
Model | e_coli_core |
Types | reaction |
Equation | 1.0 ac_c + 1.0 atp_c <-> 1.0 actp_c + 1.0 adp_c |
Bounds | (-1000.0, 1000.0) |
Reversibility | True |
Metabolites | ac_c, atp_c, actp_c, adp_c |
Boundary | False |
GPR | (b2296 | b3115 | b1849) |
Genes | b2296, b3115, b1849 |
Compartments | c |
Charge balance | {'reactants': 5.0, 'products': -5.0} |
Mass balance | {'C': 0.0, 'H': 0.0, 'O': 0.0, 'N': 0.0, 'P': 0.0} |
One can create Reactions, Metabolites and Genes using the objects mentioned above.
# imports
from mewpy.germ.algebra import Expression, parse_expression
from mewpy.germ.variables import Reaction, Metabolite, Gene
# creating the Genes
g1 = Gene(identifier='b4067', name='actP', coefficients=(0, 1))
g2 = Gene(identifier='b0010', name='satP', coefficients=(0, 1))
# Creating the GPR. A GPR is a boolean algebra expression
boolean_rule = parse_expression('b4067 and b0010')
genes = {'b4067': g1, 'b0010': g2}
gpr = Expression(symbolic=boolean_rule, variables=genes)
# creating the metabolites
m1 = Metabolite(identifier='ac_c', name='acetate cytoplasm', compartment='c', formula='C2H3O2', charge=-1)
m2 = Metabolite(identifier='ac_e', name='acetate extracellular', compartment='e', formula='C2H3O2', charge=-1)
# creating the reaction
stoichiometry = {m1: -1, m2: 1}
rxn = Reaction(identifier='ac_t',
name='acetate transport',
bounds=(0, 1000),
stoichiometry=stoichiometry,
gpr=gpr)
rxn
Identifier | ac_t |
---|---|
Name | acetate transport |
Aliases | |
Model | None |
Types | reaction |
Equation | 1 ac_c -> 1 ac_e |
Bounds | (0, 1000) |
Reversibility | False |
Metabolites | ac_c, ac_e |
Boundary | False |
GPR | (b4067 & b0010) |
Genes | b4067, b0010 |
Compartments | c, e |
Charge balance | {'reactants': 1, 'products': -1} |
Mass balance | {'C': 0, 'H': 0, 'O': 0} |
Reactions can be created automatically from GPRs in a string format. This avoids creating GPR expressions manually using the boolean expression parser. Note that, Genes are also created automatically using the identifiers in the string.
# from a GPR string
rxn3 = Reaction.from_gpr_string(identifier='ac_t2',
name='a second reaction for acetate transport having different genes',
rule='b0001 and b0002',
bounds=(0, 1000),
stoichiometry=stoichiometry)
A Reaction’s GPR is a boolean algebra expression that can be evaluated using regular boolean operators or custom operators (useful to evaluate gene expression data).
# gpr is a boolean algebra expression that can be evaluated
rxn3.gpr.evaluate(values={'b0001': 1, 'b0002': 1})
True
from mewpy.germ.algebra import And
# using a custom operator for AND
rxn3.gpr.evaluate(values={'b0001': 100, 'b0002': 50}, operators={And: min})
50
Interactions, Targets and Regulators¶
Interactions have the following attributes:
- identifier - id of the variable
- name - name of the variable
- aliases - aliases of the variable
- target - interaction target; Interactions can only have a single target gene!
- regulatory_events - a dictionary of coefficient-symbolic expressions. The symbolic expressions contain the boolean logic of regulators to activate or not the target gene; the key of a regulatory event is the expression coefficient that the target can take if the expression is evaluated to True.
- regulators - interaction regulators; a dictionary of regulator identifier-regulator variable
- regulatory_truth_table - a table with the possible coefficients of the target variable according to the regulatory events and regulators’ coefficients
and the following methods:
add_target(target)
- add the target to the interaction. It removes the current target.remove_target(target)
- remove the target from the interaction.add_regulatory_event(coefficient, expression)
- add a new regulatory event for a target coefficient. It removes the current coefficient if available.remove_regulatory_event(coefficient)
- remove the regulatory event for the target coefficient.
Targets have the following attributes:
- identifier - id of the variable
- name - name of the variable
- aliases - aliases of the variable
- coefficients - the target coefficients; all possible values that a target can take during expression evaluation; (0, 1) by default
- is_active - whether the maximum coefficient is bigger than zero
- interaction - the target interaction.
- regulators - target regulators; a dictionary of regulator identifier-regulator variable
and the following methods:
ko()
- target deletion; it sets the coefficients to zero
Regulators have the following attributes:
- identifier - id of the variable
- name - name of the variable
- aliases - aliases of the variable
- coefficients - the regulator coefficients; all possible values that a regulator can take during expression evaluation; (0, 1) by default
- is_active - whether the maximum coefficient is bigger than zero
- interactions - regulator interactions; a dictionary of interaction identifier-interaction variable
- targets - regulator targets; a dictionary of target identifier-target variable
and the following methods:
ko()
- regulator deletion; it sets the coefficients to zero
Bold-italicized properties can be set with new values (e.g., regulator.coefficients = (1,)
).
# inspecting an interaction
sdhc_interaction = model.get('b0721_interaction')
sdhc_interaction
Identifier | b0721_interaction |
---|---|
Name | b0721_interaction |
Aliases | b0721 |
Model | e_coli_core |
Types | interaction |
Target | b0721 || 1 = (( ~ (b4401 | b1334)) | b3357 | b3261) |
Regulators | b4401, b1334, b3357, b3261 |
Regulatory events | 1 = (( ~ (b4401 | b1334)) | b3357 | b3261) |
The regulatory truth table is a table with the possible coefficients of the target variable according to the regulatory events and regulators’ coefficients.
# inspecting the regulatory truth table
sdhc_interaction.regulatory_truth_table
result | b4401 | b1334 | b3357 | b3261 | |
---|---|---|---|---|---|
b0721 | 0 | NaN | NaN | NaN | NaN |
b0721 | 1 | 1.0 | 1.0 | 1.0 | 1.0 |
One can create Interactions, Targets and Regulators using the objects mentioned above.
# imports
from mewpy.germ.algebra import Expression, parse_expression
from mewpy.germ.variables import Target, Interaction, Regulator
# creating the regulators
b0001 = Regulator(identifier='b0001', name='thrL', coefficients=(0, 1))
b0002 = Regulator(identifier='b0002', name='thrA', coefficients=(0, 1))
# creating the target
b0003 = Target(identifier='b0003', name='thrB', coefficients=(0, 1))
# creating a regulatory event
b0003_expression = Expression(symbolic=parse_expression('b0002 and not b0001'),
variables={'b0001': b0001, 'b0002': b0002})
# creating the interaction
# it is always a good practice to build the expression of a given interaction first, and then use it in the
# Interaction constructor. Otherwise, interaction has alternative constructors (from_expression or from_string)
b0003_interaction = Interaction(identifier='interaction_b0003',
regulatory_events={1.0: b0003_expression},
target=b0003)
b0003_interaction
Identifier | interaction_b0003 |
---|---|
Name | |
Aliases | |
Model | None |
Types | interaction |
Target | b0003 || 1.0 = (b0002 & ( ~ b0001)) |
Regulators | b0001, b0002 |
Regulatory events | 1.0 = (b0002 & ( ~ b0001)) |
Interactions can be created automatically from a regulatory rule in a string format. This avoids creating regulatory expressions manually using the boolean expression parser. Note that, Regulators are also created automatically using the identifiers in the string.
b0004 = Target(identifier='b0004')
# creating an interaction from string. Note that propositional logic is also accepted
b0004_interaction = Interaction.from_string(identifier='b0004_interaction',
name='interaction from string creates new genes',
rule='(b0005 and b0006) or (b0007 > 0)',
target=b0004)
b0004_interaction
Identifier | b0004_interaction |
---|---|
Name | interaction from string creates new genes |
Aliases | |
Model | None |
Types | interaction |
Target | b0004 || 1.0 = ((b0005 & b0006) | (b0007 > 0)) |
Regulators | b0005, b0006, b0007 |
Regulatory events | 1.0 = ((b0005 & b0006) | (b0007 > 0)) |
One can change the outcome of a regulatory expression by changing the coefficients of the regulators.
# changing the regulatory expression by altering the regulators coefficients
b0005 = b0004_interaction.regulators['b0005']
b0005.coefficients = (0,)
b0007 = b0004_interaction.regulators['b0007']
b0007.coefficients = (0,)
b0004_interaction.regulatory_truth_table
b0005 | b0006 | b0007 | result | |
---|---|---|---|---|
b0004 | 0 | 1.0 | 0 | 0 |
It is also possible to evaluate the regulatory expression with different coefficients without changing the regulators’ coefficients.
# evaluating the regulatory expression with different regulators coefficients
# (it does not change the regulators coefficients though)
b0004_expression = b0004_interaction.regulatory_events.get(1)
b0004_expression.evaluate(values={'b0005': 1})
1
A GERM model variable is by default a multi-type variable. Integrated models often include multi-type variables representing simultaneously regulators and metabolites or targets and metabolic genes, among others. A single GERM model variable can store the information of a multi-type variable. For instance, a single variable object can share attributes and methods of a metabolite and regulator. More importantly, genes associated with reactions in a metabolic model often correspond to target genes having a regulatory interaction in the regulatory model.
MEWpy builds multi-type variables when reading GERM models.
One can check variable.types
or use type checkers, such as variable.is_regulator()
.
# access to a reaction and find all regulators associated
pdh = model.get('PDH')
pdh_regulators = []
for gene in pdh.yield_genes():
if gene.is_target():
pdh_regulators.extend(gene.yield_regulators())
print('PDH regulators: ', ', '.join(reg.id for reg in pdh_regulators))
PDH regulators: b0113, b3261, b0113, b3261
from mewpy.germ.variables import Variable
# one can create multi-type variables as follows
Variable.from_types(types=('target', 'gene'), identifier='b0001')
Working with GERM model analysis¶
In the mewpy.germ.analysis
module, simulation methods are derived from LinearProblem
.
A phenotype simulation method includes the following attributes:
method
- the name of the simulation methodmodel
- the model used to build the linear problemsolver
- a MEWpy solver instance having the linear programming implementation of variables and constraints in the selected solver. The following solvers are available: CPLEX; GUROBI; OPTLANGconstraints
- The representation of ODE to be implemented in the solver instance using linear programmingvariables
- The representation of the system variables to be implemented in the solver instance using linear programmingobjective
- A linear representation of the objective function associated with the linear problem
And the following methods:
build
- the build method is responsible for retrieving variables and constraints from a GERM model according to the mathematical formulation of each simulation methodoptimize
- the optimize method is responsible for solving the linear problem using linear programming or mixed-integer linear programming. This method accepts method-specific arguments (initial state, dynamic, etc) and solver-specific arguments (linear, minimize, constraints, get_values, etc). These arguments can override temporarily some constraints or variables during the optimization.
from mewpy.io import Reader, Engines, read_model
from mewpy.germ.analysis import SRFBA
# reading the E. coli core model
core_gem_reader = Reader(Engines.MetabolicSBML, 'e_coli_core.xml')
core_trn_reader = Reader(Engines.BooleanRegulatoryCSV,
'e_coli_core_trn.csv', sep=',', id_col=0, rule_col=2, aliases_cols=[1], header=0)
model = read_model(core_gem_reader, core_trn_reader)
# initialization does not build the model automatically
srfba = SRFBA(model).build()
srfba
Method | SRFBA |
Model | Model e_coli_core - E. coli core model - Orth et al 2010 |
Variables | 486 |
---|---|
Constraints | 326 |
Objective | {'Biomass_Ecoli_core': 1.0} |
Solver | CplexSolver |
Synchronized | True |
The optimize
interface creates a ModelSolution
output by default containing the objective value,
value of each variable in the solution, among others.
Alternatively, optimize
can create a simple solver Solution
object.
# optimization creates a ModelSolution object by default
solution = srfba.optimize()
solution
Method | SRFBA |
Model | Model e_coli_core - E. coli core model - Orth et al 2010 |
Objective | Biomass_Ecoli_core |
---|---|
Objective value | 0.8739215069684986 |
Status | optimal |
One can generate a pandas DataFrame
using the to_frame()
method of the ModelSolution
object.
This DataFrame
includes coefficients of regulatory environmental stimuli linked to the exchange fluxes.
# a solution can be converted into a df
solution.to_frame()
One can generate a Summary
object using the to_summary()
method of the ModelSolution
object.
This summary contains the following data:
inputs
- regulatory and metabolic inputs of the solutionoutputs
- regulatory and metabolic inputs of the solutionmetabolic
- values of the metabolic variablesregulatory
- values of the regulatory variablesobjective
- the objective valuedf
- the summary of inputs and outputs in the regulatory and metabolic layers
# a solution can be converted into a summary solution
summary = solution.to_summary()
# inputs + outputs of the regulatory-metabolic variables
summary.df
# values of the metabolic variables
summary.metabolic
# values of the regulatory variables
summary.regulatory
All phenotype simulation methods have a fast one-line code function to run the simulation. These functions return only the objective value of the solution.
from mewpy.germ.analysis import slim_fba
# using slim FBA analysis
slim_fba(model)
0.8739215069684303
MEWpy phenotype simulation using GERM models¶
FBA
and pFBA
are also available in MEWpy’s Simulator
,
which is the common interface to perform simulations using GERM models, COBRApy models, and Reframed models
(see Phenotype Simulation section).
from mewpy.simulation import get_simulator
# using MEWpy simulator
simulator = get_simulator(model)
simulator.simulate()
objective: 0.8739215069684303
Status: OPTIMAL
Constraints: OrderedDict()
Method: SimulationMethod.FBA
GERM model and phenotype simulation workflow¶
A phenotype simulation method must be initialized with a GERM model. A common workflow to work with GERM models and simulation methods is suggested as follows:
model = read_model(reader1, reader2)
- read the modelrfba = RFBA(model)
- initialize the simulation methodrfba.build()
- build the linear problemsolution = rfba.optimize()
- perform the optimizationmodel.reactions['MY_REACTION'].bounds = (0, 0)
- make changes to the modelsolution = RFBA(model).build().optimize()
- initialize, build and optimize the simulation method
In this workflow, model and rfba instances are not synchronized, as rfba’s optimization will generate the same output even if we make changes to the model.
To address the latest changes in the model, one must initialize, build and optimize RFBA
again.
A different workflow consists of attaching phenotype simulation methods to GERM models as follows:
model = read_model(reader1, reader2)
- read the modelrfba = RFBA(model, attach=True)
- initialize the simulation method and attach it to the modelrfba.build()
- build the linear problemsolution = rfba.optimize()
- perform the optimizationmodel.reactions['MY_REACTION'].bounds = (0, 0)
- make changes to the modelrxn_ko_solution = rfba.optimize()
- perform the optimization again but this time with the reaction deletion
The second workflow is simpler and should be used for minor changes in the model. In this workflow, rfba’s optimizations will use the latest state of the model.
# First workflow: build and optimize
srfba = SRFBA(model).build()
solution = srfba.optimize()
print('Wild-type growth rate', solution.objective_value)
# making changes (temporarily) and then build, optimize
with model:
model.regulators['b3261'].ko()
srfba = SRFBA(model).build()
solution = srfba.optimize()
print('KO growth rate', solution.objective_value)
Wild-type growth rate 0.8739215069684303
KO growth rate 1e-10
# second workflow build and optimize
srfba = SRFBA(model, attach=True).build()
solution = srfba.optimize()
print('Wild-type growth rate', solution.objective_value)
# applying the knockout and optimize (no build required)
with model:
model.regulators['b3261'].ko()
solution = srfba.optimize()
print('KO growth rate', solution.objective_value)
Wild-type growth rate 0.8739215069684986
KO growth rate 1e-10
In addition, one can attach as many simulation methods as needed to a single model instance. This behavior eases the comparison between simulation methods.
# many simulation methods attached
fba = FBA(model, attach=True).build()
pfba = pFBA(model, attach=True).build()
rfba = RFBA(model, attach=True).build()
# applying the knockout
model.regulators['b3261'].ko()
print('FBA KO growth rate:', fba.optimize().objective_value)
print('pFBA KO sum of fluxes:', pfba.optimize().objective_value)
print('RFBA KO growth rate:', rfba.optimize().objective_value)
print('SRFBA KO growth rate:', srfba.optimize().objective_value)
print()
# restore the model
model.undo()
print('FBA WT growth rate:', fba.optimize().objective_value)
print('pFBA WT sum of fluxes:', pfba.optimize().objective_value)
print('RFBA WT growth rate:', rfba.optimize().objective_value)
print('SRFBA WT growth rate:', srfba.optimize().objective_value)
FBA KO growth rate: 0.8739215069684303
pFBA KO sum of fluxes: 93768.8478640836
RFBA KO growth rate: 0.8513885233462081
SRFBA KO growth rate: 1e-10
FBA WT growth rate: 0.8739215069684303
pFBA WT sum of fluxes: 93768.8478640836
RFBA WT growth rate: 0.8513885233462081
SRFBA WT growth rate: 0.8739215069684986
FBA and pFBA¶
FBA
and pFBA
are both available in the mewpy.germ.analysis
package.
Alternatively, one can use the simple and optimized versions slim_fba
and slim_pfba
.
from mewpy.io import Reader, Engines, read_model
from mewpy.germ.analysis import FBA
# reading the E. coli core model
core_gem_reader = Reader(Engines.MetabolicSBML, 'e_coli_core.xml')
model = read_model(core_gem_reader)
# using FBA analysis
FBA(model).build().optimize()
Method | FBA |
Model | Model e_coli_core - E. coli core model - Orth et al 2010 |
Objective | Biomass_Ecoli_core |
---|---|
Objective value | 0.8739215069684303 |
Status | optimal |
# using pFBA analysis
pFBA(model).build().optimize().objective_value
93768.8478640836
FVA and deletions¶
The mewpy.germ.analysis
module includes FVA
method to inspect the solution space of a GEM model.
This module also includes single_gene_deletion
and single_reaction_deletion
methods to inspect
in silico genetic strategies.
These methods perform an FBA phenotype simulation of a single reaction deletion or gene knockout
for all reactions and genes in the metabolic model.
These methods are faster than iterating through the model reactions or genes using the ko()
method.
from mewpy.io import Reader, Engines, read_model
from mewpy.germ.analysis import fva, single_gene_deletion, single_reaction_deletion
# reading the E. coli core model
core_gem_reader = Reader(Engines.MetabolicSBML, 'e_coli_core.xml')
model = read_model(core_gem_reader)
# FVA returns the DataFrame with minium and maximum values of each reaction
fva(model)
# single reaction deletion
single_reaction_deletion(model)
# single gene deletion for specific genes
single_gene_deletion(model, genes=model.reactions['ACONTa'].genes)
growth | status | |
---|---|---|
b0118 | 0.873922 | Optimal |
b1276 | 0.873922 | Optimal |
Regulatory truth table¶
The regulatory truth table evaluates all regulatory interactions using their regulatory events (expressions).
A regulatory expression is a boolean expression that evaluates to 1
or 0
depending on the regulatory state.
mewpy.germ.analysis.regulatory_truth_table
creates a pandas DataFrame
having regulators’ values in the columns
and targets’ outcome in the index.
from mewpy.io import Reader, Engines, read_model
from mewpy.germ.analysis import regulatory_truth_table
# reading the E. coli core model
core_trn_reader = Reader(Engines.BooleanRegulatoryCSV,
'e_coli_core_trn.csv', sep=',', id_col=0, rule_col=2, aliases_cols=[1], header=0)
model = read_model(core_trn_reader)
# regulatory truth table for the regulatory model
model = read_model(core_trn_reader)
regulatory_truth_table(model)
RFBA¶
RFBA
is a phenotype simulation method based on the integration of a GEM model with a TRN at the genome-scale.
RFBA
performs first a synchronous evaluation of all regulatory interactions in the regulatory model.
This simulation is used to retrieve the regulatory state (regulators’ coefficients).
Then, the regulatory state is translated into a metabolic state (metabolic genes’ coefficients)
by performing a second synchronous evaluation of all regulatory interactions in the regulatory model.
Finally, the resulting metabolic state is used to decode metabolic constraints upon evaluation of the reactions’ GPRs
with the targets’ state.
RFBA
supports steady-state or dynamic phenotype simulations.
Dynamic RFBA
simulation performs sequential optimizations while the regulatory state is updated each time
using the reactions and metabolites coefficients of the previous optimization.
Dynamic RFBA
simulation stops when two identical solutions are found.
RFBA
is available in the mewpy.germ.analysis
package.
Alternatively, one can use the simple and optimized version slim_rfba
.
For more details consult: https://doi.org/10.1038/nature02456.
In this example we will be using E. coli iMC1010 model available at examples/models/germ/iJR904_srfba.xml and examples/models/germ/iMC1010.csv
The integrated E. coli iMC1010 model was published by Covert et al, 2004. This model consists of the E. coli iJR904 GEM model published by Reed et al, 2003 and E. coli iMC1010 TRN published by Covert et al, 2004. This model includes 904 metabolic genes, 931 unique biochemical reactions, and a TRN having 1010 regulatory interactions (target-regulators using boolean logic).
from mewpy.io import Reader, Engines, read_model
# loading E. coli iMC1010 model
imc1010_gem_reader = Reader(Engines.MetabolicSBML, 'iJR904.xml')
imc1010_trn_reader = Reader(Engines.BooleanRegulatoryCSV,
'iMC1010.csv', sep=',', id_col=0, rule_col=4, aliases_cols=[1, 2, 3], header=0)
model = read_model(imc1010_gem_reader, imc1010_trn_reader)
model
Model | iJR904 |
---|---|
Name | Reed2003 - Genome-scale metabolic network of Escherichia coli (iJR904) |
Types | regulatory, metabolic |
Compartments | e, c |
Reactions | 1083 |
Metabolites | 768 |
Genes | 904 |
Exchanges | 150 |
Demands | 0 |
Sinks | 0 |
Objective | BiomassEcoli |
Regulatory interactions | 1010 |
Targets | 1010 |
Regulators | 232 |
Regulatory reactions | 22 |
Regulatory metabolites | 96 |
Environmental stimuli | 11 |
RFBA
can be simulated using an initial regulatory state that will be used
during synchronous evaluation of all regulatory interactions.
However, setting up the regulators’ initial state is a difficult task.
Most of the time, the initial state is not known and hinders feasible solutions during simulation.
If the initial state is not provided to RFBA
, this method will consider that all regulators are active.
This initial state is clearly not the best, as many essential reactions can be switched off.
To relax some constraints, the initial state of a regulatory metabolite is inferred from its exchange reaction, namely the absolute value of the lower bound. Likewise, the initial state of a regulatory reaction is inferred from its upper bound. Even so, this initial state is likely to yield infeasible solutions.
Find conflicts¶
To mitigate conflicts between the regulatory and metabolic states,
one can use the mewpy.germ.analysis.find_conflicts()
method.
This method can find regulatory states that lead to knockouts of essential genes and deletion of essential reactions.
Note that, find_conflicts()
results should be carefully analyzed, as this method does not detect indirect conflicts.
Please consult the method for more details and the example bellow.
from mewpy.germ.analysis import find_conflicts
# we can see that 3 regulators are affecting the following essential genes: b2574; b1092; b3730
repressed_genes, repressed_reactions = find_conflicts(model)
repressed_genes
interaction | Stringent | b0676 | b4390 | |
---|---|---|---|---|
b1092 | b1092 || 1 = ( ~ Stringent) | 1.0 | NaN | NaN |
b3730 | b3730 || 1 = b0676 | NaN | 0.0 | NaN |
b2574 | b2574 || 1 = ( ~ b4390) | NaN | NaN | 1.0 |
find_conflicts()
suggests that three essential genes (b2574; b1092; b3730) are being affected
by three regulators (b4390, Stringent, b0676). However, some regulators do not affect growth directly,
as they are being regulated by other regulators, environmental stimuli, metabolites and reactions.
# regulator-target b4390 is active in high-NAD conditions (environmental stimuli)
model.get('b4390')
Identifier | b4390 |
---|---|
Name | b4390 |
Aliases | nadR, b4390, NadR, nadr |
Model | iJR904 |
Types | regulator, target |
Coefficients | (0.0, 1.0) |
Active | True |
Interactions | b0931_interaction, b2574_interaction |
Targets | b0931, b2574 |
Environmental stimulus | False |
Interaction | b4390 || 1 = high-NAD |
Regulators | high-NAD |
Now, we can infer a feasible initial state for the model and run RFBA
.
from mewpy.germ.analysis import RFBA
# initial state inferred from the find_conflicts method.
initial_state = {
'Stringent': 0.0,
'high-NAD': 0.0,
'AGDC': 0.0,
}
# steady-state RFBA
rfba = RFBA(model).build()
solution = rfba.optimize(initial_state=initial_state)
solution
Method | RFBA |
Model | Model iJR904 - Reed2003 - Genome-scale metabolic network of Escherichia coli (iJR904) |
Objective | BiomassEcoli |
---|---|
Objective value | 0.8517832811766279 |
Status | optimal |
SRFBA¶
SRFBA
is a phenotype simulation method based on the integration of a GEM model with a TRN at the genome-scale.
SRFBA
performs a single steady-state simulation using both metabolic and regulatory constraints found in the integrated model.
This method uses Mixed-Integer Linear Programming (MILP) to solve nested boolean algebra expressions formulated from
the structure of the regulatory layer (regulatory interactions) and metabolic layer (GPR rules).
For that, SRFBA
adds auxiliary variables representing intermediate boolean variables and operators.
The resulting linear problem also includes a boolean variable and constraint for each reaction linking the outcome
of the interactions and GPR constraints to the mass balance constraints.
SRFBA
only supports steady-state simulations.
SRFBA
is available in the mewpy.germ.analysis
package.
Alternatively, one can use the simple and optimized version slim_srfba
.
For more details consult: https://doi.org/10.1038%2Fmsb4100141.
In this example we will be using E. coli iMC1010 model available at models/germ/iJR904_srfba.xml and models/germ/iMC1010.csv. This is the model used in the RFBA example.
from mewpy.io import Reader, Engines, read_model
# loading E. coli iMC1010 model
imc1010_gem_reader = Reader(Engines.MetabolicSBML, 'iJR904.xml')
imc1010_trn_reader = Reader(Engines.BooleanRegulatoryCSV,
'iMC1010.csv', sep=',', id_col=0, rule_col=4, aliases_cols=[1, 2, 3], header=0)
model = read_model(imc1010_gem_reader, imc1010_trn_reader)
SRFBA
does not need an initial state in most cases, as this method can perform a steady-state simulation using MILP.
The solver tries to find a regulatory state favoring reactions that contribute to faster growth rates.
Accordingly, regulatory variables can take values between zero and one.
from mewpy.analysis import SRFBA
# steady-state SRFBA
srfba = SRFBA(model).build()
solution = srfba.optimize()
solution
Method | SRFBA |
Model | Model iJR904 - Reed2003 - Genome-scale metabolic network of Escherichia coli (iJR904) |
Objective | BiomassEcoli |
---|---|
Objective value | 0.8218562176868295 |
Status | optimal |
iFVA and iDeletions¶
The mewpy.germ.analysis
module includes an integrated version of the FVA
method named iFVA
.
This method can be used to inspect the solution space of an integrated GERM model.
iFVA
computes the minimum and maximum possible fluxes of each reaction in a metabolic model
using one of the integrated analysis mentioned above (RFBA
or SRFBA
).
This method return a pandas DataFrame
with the minium and maximum fluxes (columns) for each reaction (index).
The mewpy.germ.analysis
module also includes isingle_gene_deletion
, isingle_reaction_deletion
,
and isingle_regulator_deletion
methods to inspect in silico genetic strategies in integrated GERM models.
from mewpy.io import Reader, Engines, read_model
from mewpy.germ.analysis import ifva
# loading E. coli iMC1010 model
imc1010_gem_reader = Reader(Engines.MetabolicSBML, 'iJR904.xml')
imc1010_trn_reader = Reader(Engines.BooleanRegulatoryCSV,
'iMC1010.csv', sep=',', id_col=0, rule_col=4, aliases_cols=[1, 2, 3], header=0)
model = read_model(imc1010_gem_reader, imc1010_trn_reader)
# iFVA of the first fifteen reactions using srfba (the default method). Fraction inferior to 1 (default) to relax the constraints
reactions_ids = list(model.reactions)[:15]
ifva(model, fraction=0.9, reactions=reactions_ids, method='srfba')
minimum | maximum | |
---|---|---|
12PPDt | 0.000000e+00 | 0.000000 |
2DGLCNRx | 0.000000e+00 | 0.000000 |
2DGLCNRy | 0.000000e+00 | 0.000000 |
2DGULRx | 0.000000e+00 | 0.000000 |
2DGULRy | 0.000000e+00 | 0.000000 |
3HCINNMH | 0.000000e+00 | 0.000000 |
3HPPPNH | 0.000000e+00 | 0.000000 |
4HTHRS | 0.000000e+00 | 0.000000 |
5DGLCNR | -1.319152e+00 | 0.000000 |
A5PISO | 3.106617e-02 | 0.034518 |
AACPS1 | -7.177128e-12 | 0.055426 |
AACPS2 | -1.794282e-11 | 0.138565 |
AACPS3 | -1.291883e-10 | 0.997670 |
AACPS4 | -2.511995e-11 | 0.193991 |
AACPS5 | -1.794282e-10 | 1.385652 |
PROM¶
PROM
is a probabilistic-based phenotype simulation method.
This method circumvents discrete constraints created by RFBA
and SRFBA
using a continuous approach:
reactions’ constraints are proportional to the probabilities of related genes being active.
PROM
performs a single steady-state simulation using the probabilistic-based constraints to limit flux through some reactions.
This method cannot perform wild-type phenotype simulations though, as probabilities are calculated for single regulator deletion.
Hence, PROM
is adequate to predict the effect of regulator perturbations.
PROM
can generate a KOSolution
containing the solution of each regulator knock-out.
PROM
is available in the mewpy.germ.analysis
package.
Alternatively, one can use the simple version mewpy.germ.analysis.slim_prom
.
For more details consult: https://doi.org/10.1073/pnas.1005139107.
In this example, we will be using M. tuberculosis iNJ661 model available at examples/models/germ/iNJ661.xml, examples/models/germ/iNJ661_trn.csv, and examples/models/germ/iNJ661_gene_expression.csv.
The integrated M. tuberculosis iNJ661 model was published by Chandrasekaran et al, 2010. This model consists of the M. tuberculosis iNJ661 GEM model published by Jamshidi et al, 2007, M. tuberculosis TRN published by Balazsi et al, 2008, and gene expression dataset published by Chandrasekaran et al, 2010. This model includes 691 metabolic genes, 1028 unique biochemical reactions, and a TRN having 2018 regulatory interactions (target-regulator).
from mewpy.io import Reader, Engines, read_model
# loading M. tuberculosis iNJ661 model
inj661_gem_reader = Reader(Engines.MetabolicSBML, 'iNJ661.xml')
inj661_trn_reader = Reader(Engines.TargetRegulatorRegulatoryCSV,
'iNJ661_trn.csv', sep=';', target_col=0, regulator_col=1, header=None)
model = read_model(inj661_gem_reader, inj661_trn_reader)
model
Model | iNJ661 |
---|---|
Name | M. tuberculosis iNJ661 model - Jamshidi et al 2007 |
Types | regulatory, metabolic |
Compartments | c, e |
Reactions | 1028 |
Metabolites | 828 |
Genes | 661 |
Exchanges | 88 |
Demands | 0 |
Sinks | 0 |
Objective | biomass_Mtb_9_60atp_test_NOF |
Regulatory interactions | 178 |
Targets | 178 |
Regulators | 30 |
Regulatory reactions | 0 |
Regulatory metabolites | 0 |
Environmental stimuli | 29 |
Infer probabilities¶
PROM
phenotype simulation requires an initial state that must be inferred from the TRN and gene expression dataset.
Besides, the format of the initial state is slightly different from RFBA
and SRFBA
initial states.
PROM
’s initial state must be a dictionary in the following format:
- keys -> tuple of regulator and target gene identifiers
- value -> probability of this regulatory interaction inferred from the gene expression dataset
mewpy.omics
package contains the required methods to perform a quantile preprocessing of the gene expression dataset.
Then, one can use the mewpy.germ.analysis.prom.target_regulator_interaction_probability()
method to infer PROM
’s initial state
from mewpy.omics import ExpressionSet
from mewpy.germ.analysis import target_regulator_interaction_probability
# computing PROM target-regulator interaction probabilities using quantile preprocessing pipeline
expression = ExpressionSet.from_csv(file_path='iNJ661_gene_expression.csv', sep=';', index_col=0, header=None)
quantile_expression, binary_expression = expression.quantile_pipeline()
initial_state, _ = target_regulator_interaction_probability(model,
expression=quantile_expression,
binary_expression=binary_expression)
initial_state
...
('Rv2920c', 'Rv3575c'): 1,
('Rv3275c', 'Rv3575c'): 1,
('Rv3275c', 'Rv3676'): 0.7416666666666667,
('Rv3276c', 'Rv3575c'): 0.5045317220543807,
('Rv3276c', 'Rv3676'): 1,
('Rv0408', 'Rv3676'): 0.55,
...
Now we can perform the PROM
simulation.
from mewpy.germ.analysis import PROM
# using PROM
prom = PROM(model).build()
solution = prom.optimize(initial_state=initial_state)
solution.solutions
{'ko_Rv0001': PROM Solution
Objective value: 0.028300772436182654
Status: optimal,
'ko_Rv3575c': PROM Solution
Objective value: 0.052199202493402735
Status: optimal,
'ko_Rv3676': PROM Solution
Objective value: 0.03117434871202209
Status: optimal,
...
CoRegFlux¶
CoRegFlux
is a linear regression-based phenotype simulation method.
This method circumvents discrete constraints created by RFBA
and SRFBA
using a continuous approach:
reactions’ constraints are proportional (using soft plus activation function) to the predicted expression of related genes.
CoRegFlux
performs a single steady-state simulation using predicted gene expression data (estimated with a linear regression model)
to limit flux through some reactions.
Hence, this method can predict the phenotypic behavior of an organism for all environmental conditions of the gene expression dataset.
However, this method must use a different training dataset to infer regulators’ influence scores and train the linear regression models.
CoRegFlux
can also perform dynamic simulations for a series of time steps.
At each time step, dynamic CoRegFlux
updates metabolite concentrations and biomass yield using the euler function.
These values are then translated into additional constraints to be added to the steady-state simulation.
CoRegFlux
can generate a ModelSolution
containing the solution for a single environmental condition in the experiment dataset.
In addition, CoRegFlux
can generate a DynamicSolution
containing time-step solutions for a single environmental condition in the experiment dataset.
CoRegFlux
is available in the mewpy.germ.analysis
package.
Alternatively, one can use the simple version slim_coregflux
.
For more details consult: https://doi.org/10.1186/s12918-017-0507-0.
In this example we will be using the following models and data:
- S. cerevisae iMM904 model available at examples/models/germ/iMM904.xml,
- S. cerevisae TRN inferred with CoRegNet and available at examples/models/germ/iMM904_trn.csv,
- S. cerevisae training gene expression dataset available at examples/models/germ/iMM904_gene_expression.csv,
- S. cerevisae influence scores inferred with CoRegNet in the gene expression dataset available at examples/models/germ/iMM904_influence.csv,
- S. cerevisae experiments gene expression dataset available at examples/models/germ/iMM904_experiments.csv.
The integrated S. cerevisae iMM904 model was published by Banos et al, 2017. This model consists of the S. cerevisae iMM904 GEM model published by Mo et al, 2009, S. cerevisae TRN inferred by CoRegNet published by Nicolle et al, 2015, and gene expression datasets published by Brauer et al, 2005 and DeRisi et al, 1997. This model includes 904 metabolic genes, 1557 unique biochemical reactions, and a TRN having 3748 regulatory interactions (target-regulators separated in co-activators and co-repressors).
from mewpy.io import Reader, Engines, read_model
# loading S. cerevisae iMM904 model
imm904_gem_reader = Reader(Engines.MetabolicSBML, 'iMM904.xml')
imm904_trn_reader = Reader(Engines.CoExpressionRegulatoryCSV,
'iMM904_trn.csv', sep=',', target_col=2, co_activating_col=3, co_repressing_col=4, header=0)
model = read_model(imm904_gem_reader, imm904_trn_reader)
model
Model | iMM904 |
---|---|
Name | S. cerevisae iMM904 model - Mo et al 2009 |
Types | regulatory, metabolic |
Compartments | c, e, m, x, r, v, g, n |
Reactions | 1577 |
Metabolites | 1226 |
Genes | 905 |
Exchanges | 164 |
Demands | 0 |
Sinks | 0 |
Objective | BIOMASS_SC5_notrace |
Regulatory interactions | 3748 |
Targets | 3748 |
Regulators | 201 |
Regulatory reactions | 0 |
Regulatory metabolites | 0 |
Environmental stimuli | 199 |
Predicting gene expression¶
CoRegFlux
phenotype simulation requires an initial state that must be inferred from the TRN, gene expression dataset,
influence score matrix and experiments gene expression dataset.
This initial state contains the predicted gene expression of target metabolic genes available in the GEM model.
mewpy.germ.analysis.coregflux
module includes the tools to infer CoRegFlux
’s initial state.
These methods create the linear regression models to predict targets’ expression according to the experiments gene expression dataset.
One just have to load expression, influence and experiments CSV files using mewpy.omics.ExpressionSet
.
HINT: the predict_gene_expression
method might be time-consuming for some gene expression datasets.
One can save the predictions into a CSV file and then load it afterwards using mewpy.omics.ExpressionSet.from_csv()
.
from mewpy.omics import ExpressionSet
from mewpy.germ.analysis import predict_gene_expression
# HINT: you can uncomment the following line to load pre-computed gene expression predictions.
# Do not forget to comment the remaining lines in this cell.
# gene_expression_prediction = read_gene_expression_dataset(path.joinpath('iMM904_gene_expression_prediction.csv'),
# sep=',', gene_col=0, header=0)
expression = ExpressionSet.from_csv('iMM904_gene_expression.csv', sep=';', index_col=0, header=0).dataframe
influence = ExpressionSet.from_csv('iMM904_influence.csv', sep=';', index_col=0, header=0).dataframe
experiments = ExpressionSet.from_csv('iMM904_experiments.csv', sep=';', index_col=0, header=0).dataframe
gene_expression_prediction = predict_gene_expression(model=model, influence=influence, expression=expression,
experiments=experiments)
Now we can perform CoRegFlux
simulation.
from mewpy.germ.analysis import CoRegFlux
# steady-state simulation only requires the initial state of a given experiment (the first experiment in this case)
initial_state = list(gene_expression_prediction.to_dict().values())
co_reg_flux = CoRegFlux(model).build()
solution = co_reg_flux.optimize(initial_state=initial_state[0])
solution
Method | CoRegFlux |
Model | Model iMM904 - S. cerevisae iMM904 model - Mo et al 2009 |
Objective | BIOMASS_SC5_notrace |
---|---|
Objective value | 0.28786570373177145 |
Status | optimal |
Options¶
MEWpy makes available a large set of options, some being globally defined in mewpy.util.constants.
Number of processors for parallel solutions evaluation
By default, MEWpy uses half of the available treads to run parallel evaluations. However, a user may define the number of parallel threads by altering the NUM_CPUS
constant in mewpy.util.constants
:
from mewpy.util.constants import EAConstants
# uses 32 parallel threads
EAConstants.NUM_CPUS = 32
Over and under expression folds.
Over- and under-expression optimization problems have a set of possible folds globally defined. It is possible, however, to define such folds directly in a problem definition, for example, in a genes’ under expression problem:
# the model has already been loaded
# Define the regulation folds for under-expression or deletion.
# 0 for deletion.
levels = [1/8,1/4,1/2,0]
from mewpy.problems import GOUProblem
problem = GOUProblem(model,levels=levels)
Number of modifications
The minimum and the maximum number of modifications may be defined directly in the problem definition. For example to allow a maximum of 6 gene deletions:
from mewpy.problems import GKOProblem
problem = GKOProblem(model,candidate_max_size=6)
Likewise the minimum number of modifications may be explicitly defined:
from mewpy.problems import GKOProblem
problem = GKOProblem(model,candidate_min_size=4,candidate_max_size=6)
The default minimum and maximum number of modifications are 1 and 10 respectively. When both the minimum and the maximum number of modifications are equal, all solutions will have the same number of modifications.
Optimization algorithm
MEWpy resorts to Inspyred and JMetalPy packages to evolve modification solutions. If both packages are installed, MEWpy uses Inpyred by default, running the Non-dominated Sorting Genetic Algorithm (NSGA-II) for multi-objective optimizations and a Genetic Algorithm (GA) for single objective problems. To alter the engine preference to JMetalPy the following must be added to your script:
from mewpy.optimization import set_default_engine
set_default_engine('jmetal')
Also, MEWpy allows to define the optimization algorithm when configuring the EA for the addressed optimization problem:
ea = EA(problem, max_generations=ITERATIONS, algorithm='NSGAIII')
Note that when using a single objective, only Simulated Annealing (SA) and Genetic Algorithm (GA) are allowed. Any other configuration will be ignored. The same is true when choosing single objective algorithms to solve multi objective problems. To run multi objective problems using a single objective algorithms please refer to the AggregatedSum evaluation function.
Seeding an EA with an initial population.
The EAs may be seeded with a list of solutions, to guide the optimization or to give it a push start.
For deletion problems, the initial population is a list of solutions, represented as lists of modification targets. For example, for a reaction knock out problem, the initial population would be of the sort:
init_pop = [['R_1', 'R_10'],
['R_3', 'R_2', 'R_5'],
...
]
ea = EA(problem,initial_population=init_pop)
where each'R_i'
is a reaction on the modification target list. The modification target list, when not explicitly provided, can be retrieve from the problem instance:
problem.target_list
For over-/under-regulation optimization problems, the initial population is a list of dictionaries:
init_pop = [{'R_1':2, 'R_10':0},
{'R_3':8, 'R_2':0.5, 'R_5':0},
...
]
ea = EA(problem,initial_population=init_pop)
where each item is of the formmodification_target: fold_level
, where the folds levels are values in the list of allowed expression levels.
Simplification of solutions
By default, MEWpy simplifies the final set of solutions by removing genetic modifications that do not alter any of the of the optimization objectives from solutions. This behavior, which is time consuming, may be altered by setting the simplify flag to false when running the EA:
ea.run(simplify=False)
src¶
mewpy package¶
Subpackages¶
mewpy.io package¶
Submodules¶
mewpy.io.bnet module¶
mewpy.io.sbml module¶
Load ODE model
mewpy.io.sbml_qual module¶
mewpy.io.tabular module¶
Module contents¶
-
mewpy.io.
read_cbmodel
(io: Union[Cobra_Model, Reframed_Model], cobrapy: bool = True, reframed: bool = False, warnings: bool = True) → Union[Model, MetabolicModel][source]¶ Reading a mewpy metabolic model encoded into a Constraint-Based metabolic model from Cobrapy or Reframed. It can only return a metabolic model from the cobra model.
Only one cobra model. The cobra model platform can be explicitly set using the cobrapy (default) or reframed flags. Consult the Engines enumerator in mewpy.io.engines for further detail on this cobra model types.
Parameters: - io – A valid cobra model.
- cobrapy – Whether the cobra model is a cobrapy model
- reframed – Whether the cobra model is a reframed model
- warnings – Whether to launch warnings found during reading
Returns: mewpy metabolic model
-
mewpy.io.
read_csv
(io: Union[str, pathlib.Path, TextIOWrapper], boolean: bool = True, co_expression: bool = False, target_regulator: bool = False, warnings: bool = True, **kwargs) → Union[Model, RegulatoryModel][source]¶ Reading a mewpy regulatory model encoded into a CSV file. It can only return a regulatory model from the CSV file.
Only one CSV file is accepted. The CSV file type can be explicitly set using the boolean (default), co_expression or target_regulator flags. Consult the Engines enumerator in mewpy.io.engines for further detail on this CSV file types.
The CSV file is closed upon reading or failure
Parameters: - io – A valid string path or IO.
- boolean – Whether the file is a Boolean-based regulatory CSV file
- co_expression – Whether the file is a CoExpression (co-activating and co-repressing) regulatory CSV file
- target_regulator – Whether the file is a Target-Regulator interaction-based regulatory CSV file
- warnings – Whether to launch warnings found during reading
Returns: mewpy regulatory model
-
mewpy.io.
read_json
(io: Union[str, pathlib.Path, TextIOWrapper], warnings: bool = True) → Union[Model, MetabolicModel, RegulatoryModel][source]¶ Reading a GERM model encoded into a JSON file. It can return a metabolic, regulatory or metabolic-regulatory model from the JSON file according to the JSON file.
Only one JSON file is accepted. The GERM model is built according to the JSON file content.
The JSON file is closed upon reading or failure
Parameters: - io – A valid string path or IO.
- warnings – Whether to launch warnings found during reading
Returns: mewpy metabolic, regulatory or both model
-
mewpy.io.
read_model
(*readers, warnings: bool = True) → Union[Model, RegulatoryModel, MetabolicModel][source]¶ Reading a GERM model encoded into one or more file types (e.g. sbml, csv, cobrapy, reframed, json, etc). It can return a metabolic, regulatory or metabolic-regulatory model from multiple files.
A reader must be provided for each file type. Reading will take place according to the order and settings of the each reader.
All files are closed upon reading or failure
Parameters: - readers – Multiple Reader instances that will be used to read multiple file types into a single GERM model
- warnings – Whether to launch warnings found during reading
Returns: mewpy metabolic, regulatory or both model
-
mewpy.io.
read_sbml
(io: Union[str, pathlib.Path, TextIOWrapper], metabolic: bool = True, regulatory: bool = True, warnings: bool = True) → Union[Model, RegulatoryModel, MetabolicModel][source]¶ Reading a GERM model encoded into a SBML file. It can return a metabolic, regulatory or metabolic-regulatory model from the SBML file according to the metabolic and regulatory flags.
Only one SBML file is accepted. Thus, if the SBML file does not encode an SBML-qual or SBML-fbc plugin, but either metabolic or regulatory reading are requested, an IO error will be raised, respectively.
The SBML file is closed upon reading or failure
Parameters: - io – A valid string path or IO.
- metabolic – Whether to read the metabolic share of the SBML file, namely the fbc plugin
- regulatory – Whether to read the regulatory share of the SBML file, namely the qual plugin
- warnings – Whether to launch warnings found during reading
Returns: mewpy metabolic, regulatory or both model
-
mewpy.io.
write_model
(*writers, warnings=True) → Union[Model, RegulatoryModel, MetabolicModel][source]¶ Writing a GERM model into one or more file types (e.g. sbml, csv, cobrapy, reframed, json, etc). It can write a metabolic, regulatory or metabolic-regulatory model to multiple files.
A writer must be provided for each file type. Reading will take place according to the order and settings of the each writer.
All files are closed upon writing or failure
Parameters: - writers – Multiple Writer instances that will be used to write multiple file types from a single GERM model
- warnings – Whether to launch warnings found during reading
Returns: mewpy metabolic, regulatory or both model
mewpy.model package¶
Submodules¶
mewpy.model.gecko module¶
-
class
mewpy.model.gecko.
GeckoModel
(model, protein_properties=None, sigma=0.46, c_base=0.3855, gam=36.6, amino_acid_polymerization_cost=37.7, carbohydrate_polymerization_cost=12.8, biomass_reaction_id=None, protein_reaction_id='r_4047', carbohydrate_reaction_id='r_4048', protein_pool_exchange_id='prot_pool_exchange', common_protein_pool_id='prot_pool_c', reaction_prefix='')[source]¶ Bases:
reframed.core.cbmodel.CBModel
Class for representing GECKO models. Addapted from the original (https://github.com/SysBioChalmers/GECKO/) to reframed.
Implement a model class for Genome-scale model to account for Enzyme Constraints, using Kinetics and Omics [1].
Parameters: - model – str, A CBModel to apply protein constraints to. Can be ‘single-pool’ for the bundled ecYeast7 model using only a single pool for all proteins, or ‘multi-pool’ for the model that has separate pools for all measured proteins.
- protein_properties – pd.DataFrame, A data frame that defined molecular weight (g/mol) ‘mw’, for ‘uniprot’ proteins and their average ‘abundance’ in ppm.
- sigma – float, The parameter adjusting how much of a protein pool can take part in reactions. Fitted parameter, default is optimized for chemostat experiment in [1].
- gam – float, The growth associated maintenance cost in mmol / gDW. Default fitted for yeast 8.1.3.
- amino_acid_polymerization_cost – float, The cost for turning amino-acids in proteins in mmol / g. Default taken from [2].
- carbohydrate_polymerization_cost – float, The cost for turning monosaccharides in polysaccharides in mmol / g. Default taken from [2].
- c_base – float, The carbohydrate content at dilution rate 0.1 / h. Default taken from yeast 8.1.3.
- biomass_reaction_id – str, The identifier for the biomass reaction.
- protein_reaction_id – str, The identifier for the protein reaction.
- carbohydrate_reaction_id – str, The identifier for the carbohydrate reaction.
- protein_pool_exchange_id – str, The identifier of the protein pool exchange reaction.
- common_protein_pool_id – str, The identifier of the metabolite representing the common protein pool.
References:
[1] (1, 2) Benjamin J. Sanchez, Cheng Zhang, Avlant Nilsson, Petri-Jaan Lahtvee, Eduard J. Kerkhoven, Jens Nielsen ( 2017). Improving the phenotype predictions of a yeast genome-scale metabolic model by incorporating enzymatic constraints. [Molecular Systems Biology, 13(8): 935, http://www.dx.doi.org/10.15252/msb.20167411
[2] J. Förster, I. Famili, B. Ø. Palsson and J. Nielsen, Genome Res., 2003, 244–253.
-
adjust_biomass_composition
()[source]¶ Adjust the biomass composition.
After changing the protein and carbohydrate content based on measurements, adjust the corresponding coefficients of the biomass reaction.
-
adjust_pool_bounds
(min_objective=0.05, inplace=False, tolerance=1e-09)[source]¶ Adjust protein pool bounds minimally to make model feasible.
Bounds from measurements can make the model non-viable or even infeasible. Adjust these minimally by minimizing the positive deviation from the measured values.
Parameters: - min_objective – float, The minimum value of for the ojective for calling the model viable.
- inplace – bool, Apply the adjustments to the model.
- tolerance – float, Minimum non-zero value. Solver specific value.
Returns: pd.DataFrame, Data frame with the series ‘original’ bounds and the new ‘adjusted’ bound, and the optimized ‘addition’.
-
constrain_pool
()[source]¶ Constrain the draw reactions for the unmeasured (common protein pool) proteins.
Proteins without their own protein pool are collectively constrained by the common protein pool. Remove protein pools for all proteins that don’t have measurements, along with corresponding draw reactions, and add these to the common protein pool and reaction.
-
fraction_to_ggdw
(fraction)[source]¶ Convert protein measurements in mass fraction of total to g protein / g DW.
Parameters: fraction – pd.Series, Data of protein measurements which are absolute quantitative fractions of the total amount of these measured proteins. Normalized to sum == 1. Returns: pd.Series, g protein / g DW for the measured proteins
-
individual_protein_exchanges
¶ Individual protein-exchange reactions.
Returns: frozenset, Set of protein exchange reactions with individual pools
-
individual_proteins
¶ Get the identifiers for the proteins with their individual abundance pool.
Returns: frozenset, The set of proteins that have a defined separate pool exchange reaction.
-
limit_proteins
(fractions=None, ggdw=None, p_total=0.448, p_base=0.46)[source]¶ Apply proteomics measurements to model.
Apply measurements in the form of fractions of total of the measured proteins, or directly as g / gDW. Must supply exactly one of fractions or ggdw.
Parameters: - fractions – pd.Series, Protein abundances in fraction of total (normalized to sum to 1). Ignored if ggdw is also supplied.
- ggdw – pd.Series, Protein abundances in g / gDW
- p_total – float, measured total protein fraction in cell in g protein / g DW. Should be measured for each experiment, the default here is taken from [2].
- p_base – float, protein content at dilution rate 0.1 / h in g protein / g DW. Default taken from yeast 8.1.3.
References:
[2] (1, 2, 3) Benjamin J. Sanchez, Cheng Zhang, Avlant Nilsson, Petri-Jaan Lahtvee, Eduard J. Kerkhoven, Jens Nielsen ( 2017). Improving the phenotype predictions of a yeast genome-scale metabolic model by incorporating enzymatic constraints. [Molecular Systems Biology, 13(8): 935, http://www.dx.doi.org/10.15252/msb.20167411
-
measured_proteins
¶ Get the identifiers of the measured proteins.
Returns: frozenset, The identifiers for the unmeasured proteins.
-
pool_protein_exchanges
¶ Protein-exchange reactions by single pool.
Returns: frozenset, Set of protein exchange reactions for single pool reactions.
-
pool_proteins
¶ Get proteins modeled by common protein pool.
Returns: frozenset, The set of proteins that have a defined draw reaction.
-
protein_exchanges
¶ - Protein-exchange reactions.
- fs_matched_adjusted
Returns: frozenset, Set of protein exchange reactions (individual and common protein pool reactions).
-
protein_rev_reactions
¶ Pairs of reverse reactions associated with a protein
Returns: dictionaty, A dictionary which identifies for each protein (key) the list of reversible reactions pairs.
-
proteins
¶ Get all proteins.
Returns: frozenset, The set of all proteins identifiers.
-
unmeasured_proteins
¶ Get the identifiers of the proteins .
Returns: frozenset, The protein identifiers for the measured proteins.
mewpy.model.smoment module¶
Module contents¶
mewpy.optimization package¶
Subpackages¶
-
class
mewpy.optimization.inspyred.ea.
EA
(problem, initial_population=[], max_generations=100, mp=True, visualizer=False, algorithm=None, **kwargs)[source]¶ Bases:
mewpy.optimization.ea.AbstractEA
EA running helper
Parameters: - problem – the optimization problem.
- initial_population – (list) the EA initial population.
- max_generations – (int) the number of iterations of the EA (stopping criteria).
-
class
mewpy.optimization.inspyred.observers.
VisualizerObserver
(reference_front=None, reference_point=None, display_frequency=1, axis_labels=None, non_dominated=False, print_stats=True)[source]¶ Bases:
object
-
mewpy.optimization.inspyred.observers.
fitness_statistics
(population, directions)[source]¶ Return the basic statistics of the population’s fitness values.
Parameters: population – A population of individuals.
-
mewpy.optimization.inspyred.observers.
results_observer
(population, num_generations, num_evaluations, args)[source]¶ Print the output of the evolutionary computation to a file with the follow fields: - number of generation - fitness of candidate - the solution candidates - the solution encoded candidates
Parameters: - population – (list) the population of Individuals.
- num_generations – (int) the number of elapsed generations.
- num_evaluations – (int) the number of evaluations already performed.
- args – (dict) a dictionary of keyword arguments.
-
mewpy.optimization.inspyred.operators.
gaussian_mutation
(random, candidate, args)[source]¶ A Gaussian mutator centered in the gene[i] value
-
mewpy.optimization.inspyred.operators.
grow_mutation_KO
(random, candidate, args)[source]¶ Returns the mutant produced by a grow mutation on the candidate (when the representation is a set of integers). If a candidate solution has the maximum size candidate allowed, this function leaves it unchanged.
Parameters: - random – the random number generator object.
- candidate – the candidate solution.
- args – a dictionary of keyword arguments.
Returns: A set with a new candidate.
Notes:
Optional keyword arguments in args: - mutation_rate – the rate at which mutation is performed (default 0.1)
-
mewpy.optimization.inspyred.operators.
grow_mutation_OU
(random, candidate, args)[source]¶ Returns the mutant produced by a grow mutation on the candidate (when the representation is a set of integers). If a candidate solution has the maximum size candidate allowed, this function leaves it unchanged.
Parameters: - random – the random number generator object.
- candidate – the candidate solution.
- args – a dictionary of keyword arguments.
Returns: A set with a new candidate.
Notes:
Optional keyword arguments in args: - mutation_rate – the rate at which mutation is performed (default 0.1)
-
mewpy.optimization.inspyred.operators.
real_arithmetical_crossover
(random, mom, dad, args)[source]¶ Random trade off of n genes from the progenitors The maximum number of trade off is defined by ‘num_mix_points’ For a gene position i and a randmon value a in range 0 to 1
child_1[i] = a * parent_1[i] + (1-a) * parent_2[i] child_2[i] = (1-a) * parent_1[i] + a * parent_2[i]
-
mewpy.optimization.inspyred.operators.
shrink_mutation
(random, candidate, args)[source]¶ Returns the mutant produced by shrink mutation on the candidate. If a candidate solution has length of 1, this function leaves it unchanged.
Parameters: - random – the random number generator object. Must implement random().
- candidate – the candidate solution.
Parm args: a dictionary of keyword arguments.
Returns: A set of new candidate
- Notes:
- Optional keyword arguments in args: - mutation_rate – the rate at which mutation is performed (default 0.1)
-
mewpy.optimization.inspyred.operators.
single_mutation_KO
(random, candidate, args)[source]¶ Returns the mutant produced by a single mutation on the candidate (when the representation is a set of integers). The candidate size is maintained.
Parameters: - random – the random number generator object.
- candidate – the candidate solution.
- args – a dictionary of keyword arguments.
Returns: A set with a new candidate.
Optional keyword arguments in args:
- mutation_rate – the rate at which mutation is performed (default 0.1)
-
mewpy.optimization.inspyred.operators.
single_mutation_OU
(random, candidate, args)[source]¶ Returns the mutant produced by one mutation on the candidate (when the representation is a set of (int,int)). The candidate size is maintained.
Parameters: - random – the random number generator object.
- candidate – the candidate solution.
- args – a dictionary of keyword arguments.
Returns: A set with a new candidate.
Optional keyword arguments in args:
- mutation_rate – the rate at which mutation is performed (default 0.1)
-
mewpy.optimization.inspyred.operators.
single_mutation_OU_level
(random, candidate, args)[source]¶ Returns the mutant produced by one mutation on the candidate (when the representation is a set of (int,int)). The candidate size is maintained.
Parameters: - random – the random number generator object.
- candidate – the candidate solution.
- args – a dictionary of keyword arguments.
Returns: A set with a new candidate.
Optional keyword arguments in args:
- mutation_rate – the rate at which mutation is performed (default 0.1)
-
mewpy.optimization.inspyred.operators.
single_real_mutation
(random, candidate, args)[source]¶ Returns the mutant produced by a single mutation on the candidate (when the representation is a set of integers). The candidate size is maintained.
random : the random number generator object candidate : the candidate solution args : a dictionary of keyword arguments
out : new candidate
Optional keyword arguments in args:
- mutation_rate – the rate at which mutation is performed (default 0.1)
-
mewpy.optimization.inspyred.operators.
uniform_crossover_KO
(random, mom, dad, args)[source]¶ Return the offspring of the uniform crossover on the candidate. Based on two candidates (parents) build 2 children: - elements present in both parents will be present in both children; - both children have at least one element; - elements present in only one parent have equal probability to be present in child 1 or child 2 (after each child has at least one element).
Parameters: - random – the random number generator object. Must implement random().
- mom – the first parent candidate
- dad – the second parent candidate
- args – a dictionary of keyword arguments.
- crossover_rate – The rate at which crossover is performed (default 1.0).
Returns: Return the offspring of the candidates given as argument.
-
mewpy.optimization.inspyred.operators.
uniform_crossover_OU
(random, mom, dad, args)[source]¶ Return the offspring of the uniform crossover on the candidate. Based on two candidates (parents) build 2 children: - elements present in both parents will be present in both children; - both children have at least one element; - elements present in only one parent have equal probability to be present in child 1 or child 2 (after each child has at least one element).
Parameters: - random – the random number generator object. Must implement random().
- mom – the first parent candidate
- dad – the second parent candidate
- args – a dictionary of keyword arguments.
Returns: Return the offspring of the candidates given as argument.
- Notes:
- Optional keyword arguments in args: - crossover_rate – the rate at which crossover is performed (default 1.0)
-
class
mewpy.optimization.inspyred.problem.
InspyredProblem
(problem, directions)[source]¶ Bases:
mewpy.util.process.Evaluable
Inspyred EA builder helper.
Parameters: problem – the optimization problem.
-
class
mewpy.optimization.jmetal.ea.
EA
(problem, initial_population=[], max_generations=100, mp=True, visualizer=False, algorithm=None, **kwargs)[source]¶ Bases:
mewpy.optimization.ea.AbstractEA
EA running helper for JMetal.
Parameters: - problem – The optimization problem.
- initial_population – (list) The EA initial population.
- max_generations – (int) The number of iterations of the EA (stopping criteria).
-
class
mewpy.optimization.jmetal.observers.
PrintObjectivesStatObserver
(frequency: float = 1.0)[source]¶ Bases:
object
-
class
mewpy.optimization.jmetal.operators.
GaussianMutation
(probability: float = 0.1, gaussian_mutation_rate: float = 0.1, gaussian_mean: float = 0.0, gaussian_std: float = 1.0)[source]¶ Bases:
jmetal.core.operator.Mutation
A Gaussian mutator
-
class
mewpy.optimization.jmetal.operators.
GrowMutationKO
(probability: float = 0.1, max_size: int = 10)[source]¶ Bases:
jmetal.core.operator.Mutation
Grow mutation. A gene is added to the solution.
Parameters: - probability – (float), The mutation probability.
- min_size – (int) the solution minimum size.
-
class
mewpy.optimization.jmetal.operators.
GrowMutationOU
(probability: float = 0.1, max_size: int = 10)[source]¶ Bases:
jmetal.core.operator.Mutation
Grow mutation. A gene is added to the solution.
Parameters: - probability – (float), The mutation probability.
- min_size – (int) the solution minimum size.
-
class
mewpy.optimization.jmetal.operators.
MutationContainer
(probability: float = 0.5, mutators=[])[source]¶ Bases:
jmetal.core.operator.Mutation
A container for the mutation operators.
Parameters: - probability – (float) The probability of applying a mutation.
- mutators – (list) The list of mutators.
-
class
mewpy.optimization.jmetal.operators.
ShrinkMutation
(probability: float = 0.1, min_size: int = 1)[source]¶ Bases:
jmetal.core.operator.Mutation
Shrink mutation. A gene is removed from the solution.
Parameters: - probability – (float), The mutation probability.
- min_size – (int) the solution minimum size.
-
class
mewpy.optimization.jmetal.operators.
SingleMutationKO
(probability: float = 0.1)[source]¶ Bases:
jmetal.core.operator.Mutation
Mutates a single element
-
class
mewpy.optimization.jmetal.operators.
SingleMutationOU
(probability: float = 0.1)[source]¶ Bases:
jmetal.core.operator.Mutation
Mutates a single element
-
class
mewpy.optimization.jmetal.operators.
SingleMutationOULevel
(probability: float = 0.1)[source]¶ Bases:
jmetal.core.operator.Mutation
Mutates the expression level of a single element
-
class
mewpy.optimization.jmetal.operators.
SingleRealMutation
(probability: float = 0.1)[source]¶ Bases:
jmetal.core.operator.Mutation
Mutates a single element
-
class
mewpy.optimization.jmetal.operators.
UniformCrossover
(probability: float = 0.1)[source]¶ Bases:
jmetal.core.operator.Crossover
Uniform Crossover
-
class
mewpy.optimization.jmetal.operators.
UniformCrossoverKO
(probability: float = 0.1, max_size: int = 10)[source]¶ Bases:
jmetal.core.operator.Crossover
Uniform Crossover for KO solutions
Parameters: - probability – (float) The probability of crossover.
- max_size – (int) The solution maximum size.
-
class
mewpy.optimization.jmetal.operators.
UniformCrossoverOU
(probability: float = 0.1, max_size: int = 10)[source]¶ Bases:
jmetal.core.operator.Crossover
Uniform Crossover for OU solutions
-
class
mewpy.optimization.jmetal.problem.
JMetalKOProblem
(problem, initial_polulation)[source]¶ Bases:
jmetal.core.problem.Problem
,mewpy.util.process.Evaluable
-
create_solution
() → mewpy.optimization.jmetal.problem.KOSolution[source]¶ Creates a random_search solution to the problem.
Returns: Solution.
-
evaluate
(solution: mewpy.optimization.jmetal.problem.KOSolution) → mewpy.optimization.jmetal.problem.KOSolution[source]¶ Evaluate a solution. For any new problem inheriting from
Problem
, this method should be replaced. Note that this framework ASSUMES minimization, thus solutions must be evaluated in consequence.Returns: Evaluated solution.
-
-
class
mewpy.optimization.jmetal.problem.
JMetalOUProblem
(problem, initial_polulation=[])[source]¶ Bases:
jmetal.core.problem.Problem
,mewpy.util.process.Evaluable
-
create_solution
() → mewpy.optimization.jmetal.problem.OUSolution[source]¶ Creates a random_search solution to the problem.
Returns: Solution.
-
evaluate
(solution: mewpy.optimization.jmetal.problem.KOSolution) → mewpy.optimization.jmetal.problem.KOSolution[source]¶ Evaluate a solution. For any new problem inheriting from
Problem
, this method should be replaced. Note that this framework ASSUMES minimization, thus solutions must be evaluated in consequence.Returns: Evaluated solution.
-
-
class
mewpy.optimization.jmetal.problem.
KOSolution
(lower_bound: int, upper_bound: int, number_of_variables: int, number_of_objectives: int, number_of_constraints: int = 0)[source]¶ Bases:
jmetal.core.solution.Solution
,mewpy.optimization.ea.SolutionInterface
Class representing a KO solution
-
class
mewpy.optimization.jmetal.problem.
OUSolution
(lower_bound: List[int], upper_bound: List[int], number_of_variables: int, number_of_objectives: int)[source]¶ Bases:
jmetal.core.solution.Solution
,mewpy.optimization.ea.SolutionInterface
Class representing a Over/Under expression solution.
Submodules¶
mewpy.optimization.ea module¶
-
class
mewpy.optimization.ea.
AbstractEA
(problem: AbstractProblem, initial_population: List[T] = [], max_generations: int = 100, mp: bool = True, visualizer: bool = False, **kwargs)[source]¶ Bases:
abc.ABC
-
dataframe
()[source]¶ Returns a dataframe of the final population.
Raises: Exception – if the final population is empty or None. Returns: Returns a dataframe of the final population Return type: pandas.Dataframe
-
-
class
mewpy.optimization.ea.
Solution
(values: Any, fitness: List[float], constraints: Dict[str, Union[float, Tuple[float, float]]] = None, is_maximize: bool = True)[source]¶
-
class
mewpy.optimization.ea.
SolutionInterface
[source]¶ Bases:
abc.ABC
An interface for EA solutions.
-
mewpy.optimization.ea.
cmetric
(pf1, pf2, maximize=True)[source]¶ Computes the c-metric quality indicator.
Parameters: - pf1 – The first pareto front.
- pf2 – The second pareto front.
- maximize – (bool) maximization (True) or minimization (False).
Returns: r1,r2,pf1_2,pf2_1 r1: percentage of solutions on pf2 dominated by some solution on pf1; r2: percentage of solutions on pf1 dominated by some solution on pf2; pf1_2: solutions on pf2 dominated by some solution on pf1; pf2_1: solutions on pf1 dominated by some solution on pf2.
-
mewpy.optimization.ea.
dominance_test
(solution1, solution2, maximize=True)[source]¶ Testes Pareto dominance
Parameters: - solution1 – The first solution.
- solution2 – The second solution.
- maximize – (bool) maximization (True) or minimization (False)
Returns: 1 : if the first solution dominates the second; -1 : if the second solution dominates the first; 0 : if non of the solutions dominates the other.
mewpy.optimization.evaluation module¶
-
class
mewpy.optimization.evaluation.
AggregatedSum
(fevaluation: List[mewpy.optimization.evaluation.EvaluationFunction], tradeoffs: List[float] = None, maximize: bool = True)[source]¶ Bases:
mewpy.optimization.evaluation.PhenotypeEvaluationFunction
,mewpy.optimization.evaluation.KineticEvaluationFunction
-
class
mewpy.optimization.evaluation.
BPCY
(biomass: str, product: str, uptake: str = None, maximize: bool = True, **kwargs)[source]¶ Bases:
mewpy.optimization.evaluation.PhenotypeEvaluationFunction
-
class
mewpy.optimization.evaluation.
BPCY_FVA
(biomass: str, product: str, uptake: str = None, maximize: bool = True, **kwargs)[source]¶ Bases:
mewpy.optimization.evaluation.PhenotypeEvaluationFunction
-
class
mewpy.optimization.evaluation.
CNRFA
(reactions: List[str], threshold: float = 0.1, maximize: bool = True, **kwargs)[source]¶ Bases:
mewpy.optimization.evaluation.PhenotypeEvaluationFunction
-
class
mewpy.optimization.evaluation.
CandidateSize
(maximize: bool = False)[source]¶ Bases:
mewpy.optimization.evaluation.PhenotypeEvaluationFunction
,mewpy.optimization.evaluation.KineticEvaluationFunction
-
get_fitness
(simulResult, candidate, **kwargs)[source]¶ Evaluates a candidate
Parameters: - simul_results – (dic) A dictionary of phenotype SimulationResult objects
- candidate – Candidate beeing evaluated
Returns: A fitness value.
-
-
class
mewpy.optimization.evaluation.
EvaluationFunction
(maximize: bool = True, worst_fitness: float = 0.0)[source]¶ Bases:
object
-
get_fitness
(simul_results, candidate, **kwargs)[source]¶ Evaluates a candidate
Parameters: - simul_results – (dic) A dictionary of phenotype SimulationResult objects
- candidate – Candidate beeing evaluated
Returns: A fitness value.
-
no_solution
¶ Value to be retuned for wost case evaluation
-
-
class
mewpy.optimization.evaluation.
FluxDistance
(fluxes: Dict[str, float], maximize: bool = False, worst_fitness: float = 1000)[source]¶
-
class
mewpy.optimization.evaluation.
KineticEvaluationFunction
(maximize=True, worst_fitness=0.0)[source]¶
-
class
mewpy.optimization.evaluation.
ModificationType
(penalizations: Dict[str, float] = {'KO': 5, 'OE': 0, 'UE': 2}, maximize: bool = True)[source]¶ Bases:
mewpy.optimization.evaluation.PhenotypeEvaluationFunction
,mewpy.optimization.evaluation.KineticEvaluationFunction
-
get_fitness
(simulResult, candidate, **kwargs)[source]¶ Evaluates a candidate
Parameters: - simul_results – (dic) A dictionary of phenotype SimulationResult objects
- candidate – Candidate beeing evaluated
Returns: A fitness value.
-
-
class
mewpy.optimization.evaluation.
MolecularWeight
(reactions: List[str], maximize: bool = False, **kwargs)[source]¶ Bases:
mewpy.optimization.evaluation.PhenotypeEvaluationFunction
-
get_fitness
(simul_results, candidate, **kwargs)[source]¶ Evaluates a candidate
Parameters: - simul_results – (dic) A dictionary of phenotype SimulationResult objects
- candidate – Candidate beeing evaluated
Returns: A fitness value.
-
-
class
mewpy.optimization.evaluation.
PhenotypeEvaluationFunction
(maximize=True, worst_fitness=0.0)[source]¶
-
class
mewpy.optimization.evaluation.
TargetFlux
(reaction: str, biomass: str = None, maximize: bool = True, min_biomass_value: float = None, min_biomass_per: float = 0.0, method: Union[str, mewpy.simulation.simulation.SimulationMethod] = <SimulationMethod.pFBA: 'pFBA'>)[source]¶ Bases:
mewpy.optimization.evaluation.PhenotypeEvaluationFunction
,mewpy.optimization.evaluation.KineticEvaluationFunction
-
get_fitness
(simul_results, candidate, **kwargs)[source]¶ Evaluates a candidate
Parameters: - simul_results – (dic) A dictionary of phenotype SimulationResult objects
- candidate – Candidate beeing evaluated
Returns: A fitness value.
-
-
class
mewpy.optimization.evaluation.
TargetFluxWithConstraints
(reaction: str, constraints: Dict[str, Union[float, Tuple[float, float]]], maximize: bool = False)[source]¶ Bases:
mewpy.optimization.evaluation.EvaluationFunction
-
get_fitness
(simul_results, candidate, **kwargs)[source]¶ Evaluates a candidate
Parameters: - simul_results – (dic) A dictionary of phenotype SimulationResult objects
- candidate – Candidate beeing evaluated
Returns: A fitness value.
-
-
class
mewpy.optimization.evaluation.
WYIELD
(biomassId: str, productId: str, maximize: bool = True, **kwargs)[source]¶ Bases:
mewpy.optimization.evaluation.PhenotypeEvaluationFunction
Module contents¶
-
mewpy.optimization.
EA
(problem, initial_population=[], max_generations=100, mp=True, visualizer=False, algorithm=None, **kwargs)[source]¶ EA running helper. Returns an instance of the EA that reflects the global user configuration settings such as preferred engine and algorithm.
Parameters: - problem – The optimization problem.
- initial_population (list) – The EA initial population. Default [].
- max_generations (int) – The number of iterations of the EA (stopping criteria). Default globally defined.
- mp (bool) – If multiprocessing should be used. Default True.
- visualizer (bool) – If the pareto font should be displayed. Requires a graphic environment. Default False.
Additional optional arguments:
Parameters: population_size (int) – EA population size. Returns: An instance of an EA optimizer.
mewpy.problems package¶
Submodules¶
mewpy.problems.gecko module¶
Author: Vitor Pereira Contributors: Sergio Salgado Briegas ##############################################################################
-
class
mewpy.problems.gecko.
GeckoKOProblem
(model: Union[GeckoModel, GECKOModel], fevaluation: List[EvaluationFunction] = None, **kwargs)[source]¶ Bases:
mewpy.problems.problem.AbstractKOProblem
Gecko KnockOut Optimization Problem
Parameters: - model – The constraint based metabolic model.
- fevaluation (list) – a list of callable EvaluationFunctions.
Optional:
Parameters: - envcond (OrderedDict) – environmental conditions.
- constraints (OrderedDict) – additional constraints to be applied to the model.
- candidate_min_size (int) – The candidates minimum size.
- candidate_max_size (int) – The candidates maximum size.
- target (list) – List of target reactions.
- non_target (list) – List of non target reactions. Not considered if a target list is provided.
- scalefactor (float) – a scaling factor to be used in the LP formulation.
- prot_prefix (str) – the protein draw reaction prefix. Default draw_prot_.
Note:
Targets as well as non target proteins are defined using their prot id, ex P0351, and not by the associated draw reaction id, ex draw_prot_P0351.
-
class
mewpy.problems.gecko.
GeckoOUProblem
(model: Union[GeckoModel, GECKOModel], fevaluation: List[EvaluationFunction] = None, **kwargs)[source]¶ Bases:
mewpy.problems.problem.AbstractOUProblem
Gecko Under/Over expression Optimization Problem
Parameters: - (metabolic model) (model) – The constraint based metabolic model.
- (list) (fevaluation) – a list of callable EvaluationFunctions.
Optional:
Parameters: - envcond (OrderedDict) – environmental conditions.
- constraints (OrderedDict) – additional constraints to be applied to the model.
- candidate_min_size (int) – The candidates minimum size.
- candidate_max_size (int) – The candidates maximum size.
- target (list) – List of target reactions.
- non_target (list) – List of non target reactions. Not considered if a target list is provided.
- scalefactor (float) – a scaling factor to be used in the LP formulation.
- reference (dic) – Dictionary of flux values to be used in the over/under expression values computation.
- prot_prefix (str) – the protein draw reaction prefix. Default draw_prot_.
- twostep (boolean) – If deletions should be applied before identifiying reference flux values.
- partial_solution (dict) – A partial solution to be appended to any other solution
Note: Target as well as non target proteins are defined with their prot id, ex P0351, and with the associated reaction id, ex draw_prot_P0351.
-
encode
(candidate)[source]¶ Translates a candidate solution in problem specific representation to an iterable of ids, or (ids, folds).
Parameters: candidate (iterable) – The candidate representation. Returns: a list of index tupple (modification_target_index,level_index). The indexes are problem dependent.
-
solution_to_constraints
(candidate)[source]¶ Converts a candidate, a dict {protein:lv}, into a dictionary of constraints Reverseble reactions associated to proteins with over expression are KO according to the flux volume in the wild type.
Parameters: candidate – The candidate to be decoded. Returns: A dictionary of metabolic constraints.
-
class
mewpy.problems.gecko.
KcatOptProblem
(model: Union[GeckoModel, GECKOModel], proteins: List[str], fevaluation: List[EvaluationFunction] = None, **kwargs)[source]¶ Bases:
mewpy.problems.problem.AbstractOUProblem
-
evaluate_solution
(solution, decode=True)[source]¶ Evaluates a single solution, a list of constraints.
Parameters: - solution – The solution to be evaluated.
- decode – If the solution needs to be decoded.
Returns: A list of fitness.
-
generator
(random, **kwargs)[source]¶ Generates a solution, a random (int,int) set with length in range min_solution_size to max_solution_size.
Parameters: - random – A random number generator. Needs to implement uniform and randint generators methods.
- args (dict) – A dictionary of additional parameters.
Returns: A new solution.
-
solution_to_constraints
(solution: Dict[Tuple[str, str], float]) → Simulator[source]¶ Converts a solution to an instance of Simulator. Contrary to the usual signature of the method, it does not return a set of contrainsts but an instance of Simulator wrapping a model with modified kcats.
Parameters: solution – a solution, a dictionary of {(ProteinID,ReactionID): kcat value} Returns: A Simulator
-
mewpy.problems.genes module¶
approach employed in omics integration by substituting the logical operators (AND,OR) by min and sum|max functions.
-
class
mewpy.problems.genes.
GKOProblem
(model: Union[Model, CBModel], fevaluation: List[EvaluationFunction] = None, **kwargs)[source]¶ Bases:
mewpy.problems.problem.AbstractKOProblem
Gene Knockout Optimization Problem.
Parameters: - model – The constraint metabolic model.
- fevaluation (list) – A list of callable EvaluationFunctions.
Optional:
Parameters: - envcond (OrderedDict) – Environmental conditions.
- constraints (OrderedDict) – Additional constraints to be applied to the model.
- candidate_min_size (int) – The candidate minimum size (Default EAConstants.MIN_SOLUTION_SIZE)
- candidate_max_size (int) – The candidate maximum size (Default EAConstants.MAX_SOLUTION_SIZE)
- target (list) – List of modification target genes.
- non_target (list) – List of non target genes. Not considered if a target list is provided.
- scalefactor (float) – A scaling factor to be used in the LP formulation.
-
class
mewpy.problems.genes.
GOUProblem
(model: Union[Model, CBModel], fevaluation: List[EvaluationFunction] = None, **kwargs)[source]¶ Bases:
mewpy.problems.problem.AbstractOUProblem
Gene Over/Under expression Optimization Problem
Parameters: - model – The constraint metabolic model.
- fevaluation (list) – A list of callable EvaluationFunctions.
Optional:
Parameters: - envcond (OrderedDict) – Environmental conditions.
- constraints (OrderedDict) – Additional constraints to be applied to the model.
- candidate_min_size (int) – The candidate minimum size (Default EAConstants.MIN_SOLUTION_SIZE)
- candidate_max_size (int) – The candidate maximum size (Default EAConstants.MAX_SOLUTION_SIZE)
- target (list) – List of modification target genes.
- non_target (list) – List of non target genes. Not considered if a target list is provided.
- scalefactor (float) – A scaling factor to be used in the LP formulation.
- reference (dic) – Dictionary of flux values to be used in the over/under expression values computation.
- operators (tuple) – (and, or) operations. Default (MIN, MAX).
- levels (list) – Over/under expression levels (Default EAConstants.LEVELS).
- twostep (boolean) – If deletions should be applied before identifiying reference flux values.
- partial_solution (dict) – A partial solution to be appended to any other solution
Note: Operators that can not be pickled may be defined by a string e.g. ‘lambda x,y: (x+y)/2’.
mewpy.problems.problem module¶
-
class
mewpy.problems.problem.
AbstractKOProblem
(model: Union[Model, CBModel], fevaluation: List[EvaluationFunction] = None, **kwargs)[source]¶ Bases:
mewpy.problems.problem.AbstractProblem
-
bounder
¶ The KO list index bounder
-
encode
(candidate)[source]¶ Translates a candidate solution in problem specific representation to an iterable of ids, or (ids, folds).
Parameters: candidate – The candidate representation.
-
-
class
mewpy.problems.problem.
AbstractOUProblem
(model: Union[Model, CBModel], fevaluation: List[EvaluationFunction] = None, **kwargs)[source]¶ Bases:
mewpy.problems.problem.AbstractProblem
Base class for Over/Under expression optimization problems
-
bounder
¶ The list and levels index bounder
Returns: a OUBounder object.
-
decode
(candidate)[source]¶ The decoder function for the problem. Needs to be implemented by extending classes.
-
encode
(candidate)[source]¶ Translates a candidate solution in problem specific representation to an iterable of ids, or (ids, folds).
Parameters: candidate (iterable) – The candidate representation. Returns: a list of index tupple (modification_target_index,level_index). The indexes are problem dependent.
-
generator
(random, **kwargs)[source]¶ Generates a solution, a random (int,int) set with length in range min_solution_size to max_solution_size.
Parameters: - random – A random number generator. Needs to implement uniform and randint generators methods.
- args (dict) – A dictionary of additional parameters.
Returns: A new solution.
-
ou_constraint
(level: Union[int, float], wt: float)[source]¶ Computes the bounds for a reaction.
Parameters: - level (float) – The expression level for the reaction.
- wt (float) – The reference reaction flux.
Returns: A tupple, flux bounds for the reaction.
-
reaction_constraints
(rxn: str, lv: Union[int, float], reference: Dict[str, float])[source]¶ Converts a (reaction, level) pair into a constraint If a reaction is reversible, the direction with no or less wild type flux is knocked out.
Parameters: - rxn – The reaction identifier.
- lv – the wild type multiplier factor. The reaction bounds are altered accordingly.
Returns: A dictionary of reaction constraints.
-
reference
¶ Returns: A dictionary of reference flux values.
-
-
class
mewpy.problems.problem.
AbstractProblem
(model: Union[Model, CBModel], fevaluation: List[EvaluationFunction] = None, **kwargs)[source]¶ Bases:
abc.ABC
-
evaluate_solution
(solution, decode=True)[source]¶ Evaluates a single solution, a list of constraints.
Parameters: - solution – The solution to be evaluated.
- decode – If the solution needs to be decoded.
Returns: A list of fitness.
-
is_maximization
¶
-
simplify
(solution, tolerance=1e-06)[source]¶ Simplify a solution by removing the modification that do not affect the final fitness value. Two solutions are considered different if the maximum allowed difference between objective values is exceeded. Tolerance may be defined by a single float value, or per objective by setting a list of floats of size equal to the number of objectives.
Parameters: - solution – the solution to be simplified
- tolerance (float) – The maximum allowed difference between objective values.
Returns: A list of simplified solutions.
-
simplify_population
(population, n_cpu=1)[source]¶ Simplifies a population of solutions
- Args:
- population (list): List of mewpy.optimization.ea.Solution n_cpu (int): Number of CPUs.
- Returns:
- list: Simplified population
-
simulate
(*args, **kwargs)[source]¶ Simulates a phenotype when applying a set of constraints using the specified method.
Parameters: - objective (dic) – The simulation objective. If none, the model objective is considered.
- method (str) – The SimulationMethod (FBA, pFBA, lMOMA, etc …)
- maximize (boolean) – The optimization direction
- constraints (dict) – A dictionary of constraints to be applied to the model.
- reference (dict) – A dictionary of reaction flux values.
- scalefactor (float) – A positive scaling factor for the solver. Default None.
- solver – An instance of the solver.
- solution (dict) – A solution to be converted to constraints in the context of the problem.
-
simulator
¶
-
target_list
¶ list of modification targets
-
-
class
mewpy.problems.problem.
KOBounder
(lower_bound, upper_bound)[source]¶ Bases:
object
A bounder of possible indexes in an enumeration
mewpy.problems.reactions module¶
-
class
mewpy.problems.reactions.
MediumProblem
(model: Union[Model, CBModel], fevaluation: List[EvaluationFunction] = None, **kwargs)[source]¶ Bases:
mewpy.problems.problem.AbstractOUProblem
Medium Optimization Problem. Try to find an optimized uptake configuration. By default all uptake reactions are considered. Uptake reactions not included in a solution candidate are KO.
Parameters: - model – The constraint metabolic model.
- fevaluation (list) – A list of callable EvaluationFunctions.
Optional parameters:
Parameters: - envcond (OrderedDict) – Environmental conditions.
- constraints (OrderedDict) – Additional constraints to be applied to the model.
- candidate_min_size (int) – The candidate minimum size (Default EAConstants.MIN_SOLUTION_SIZE)
- candidate_max_size (int) – The candidate maximum size (Default EAConstants.MAX_SOLUTION_SIZE)
- target (list) – List of modification target reactions.
- non_target (list) – List of non target reactions. Not considered if a target list is provided.
- scalefactor (float) – A scaling factor to be used in the LP formulation.
- reference (dic) – Dictionary of flux values to be used in the over/under expression values computation.
- levels (list) – Over/under expression levels (Default [0,0.1,0.2,…,9.9,10.0])
-
class
mewpy.problems.reactions.
RKOProblem
(model: Union[Model, CBModel], fevaluation: List[EvaluationFunction] = None, **kwargs)[source]¶ Bases:
mewpy.problems.problem.AbstractKOProblem
Reaction Knockout Optimization Problem.
Parameters: - model – The constraint metabolic model.
- fevaluation (list) – A list of callable EvaluationFunctions.
Optional parameters:
Parameters: - envcond (OrderedDict) – Environmental conditions.
- constraints (OrderedDict) – Additional constraints to be applied to the model.
- candidate_min_size (int) – The candidate minimum size (Default EAConstants.MIN_SOLUTION_SIZE)
- candidate_max_size (int) – The candidate maximum size (Default EAConstants.MAX_SOLUTION_SIZE)
- target (list) – List of modification target reactions.
- non_target (list) – List of non target reactions. Not considered if a target list is provided.
- scalefactor (float) – A scaling factor to be used in the LP formulation.
-
class
mewpy.problems.reactions.
ROUProblem
(model: Union[Model, CBModel], fevaluation: List[EvaluationFunction] = None, **kwargs)[source]¶ Bases:
mewpy.problems.problem.AbstractOUProblem
Reaction Over/Under Expression Optimization Problem
Parameters: - model – The constraint metabolic model.
- fevaluation (list) – A list of callable EvaluationFunctions.
Optional parameters:
Parameters: - envcond (OrderedDict) – Environmental conditions.
- constraints (OrderedDict) – Additional constraints to be applied to the model.
- candidate_min_size (int) – The candidate minimum size (Default EAConstants.MIN_SOLUTION_SIZE)
- candidate_max_size (int) – The candidate maximum size (Default EAConstants.MAX_SOLUTION_SIZE)
- target (list) – List of modification target reactions.
- non_target (list) – List of non target reactions. Not considered if a target list is provided.
- scalefactor (float) – A scaling factor to be used in the LP formulation.
- reference (dic) – Dictionary of flux values to be used in the over/under expression values computation.
- levels (list) – Over/under expression levels (Default EAConstants.LEVELS)
- twostep (boolean) – If deletions should be applied before identifiying reference flux values.
Module contents¶
mewpy.regulation package¶
Submodules¶
mewpy.regulation.RFBA module¶
mewpy.regulation.SRFBA module¶
mewpy.regulation.integrated_model module¶
mewpy.regulation.optorf module¶
mewpy.regulation.optram module¶
mewpy.regulation.regulatory_interaction module¶
mewpy.regulation.regulatory_model module¶
mewpy.regulation.regulatory_variable module¶
mewpy.regulation.variable module¶
Module contents¶
mewpy.simulation package¶
Submodules¶
mewpy.simulation.cobra module¶
-
class
mewpy.simulation.cobra.
CobraModelContainer
(model: cobra.core.model.Model = None)[source]¶ Bases:
mewpy.simulation.simulation.ModelContainer
A basic container for COBRApy models.
Parameters: model – A metabolic model. -
compartments
¶ Get the dictionary of current compartment descriptions.
Returns: the dictionary of current compartment Return type: dict
-
genes
¶ Lists the model gene identifiers
Returns: A list of gene identifiers Return type: list
-
get_compartment
(c_id: str) → Dict[str, Any][source]¶ Get a dictionary of a compartment descriptions.
Parameters: c_id (str) – The compartment identifier Returns: a dictionary of a compartment descriptions. Return type: dict
-
get_exchange_reactions
() → List[str][source]¶ Get the list of exchange reactions
Returns: The list of exchange reactions Return type: List[str]
-
get_gene_reactions
() → Dict[str, List[str]][source]¶ Get a map of genes to reactions.
Returns: A map of genes to reactions. Return type: Dict[str,List[str]]
-
get_gpr
(reaction_id: str) → str[source]¶ Returns the gpr rule (str) for a given reaction ID.
Parameters: reaction_id (str) – The reaction identifier. Returns: A string representation of the GPR rule.
-
get_metabolite
(m_id) → Dict[str, Any][source]¶ Returns a metabolite propeties.
Parameters: m_id (str) – The metabolite identifier Returns: A dictionary of properties Return type: Dict[str,Any]
-
get_reaction
(r_id: str) → Dict[str, dict][source]¶ Returns a dictionary of a reaction properties.
Parameters: r_id (str) – The reaction identifier. Returns: A dictionary of properties Return type: AttrDict
-
id
¶ The model identifier.
-
medium
¶ Returns the model default medium.
Returns: The constraints on the model exchanges. Return type: dict
-
metabolites
¶ Lists the model metabolite identifiers
Returns: A list of metabolite identifiers Return type: list
-
reactions
¶ The list of reaction identifiers
-
-
class
mewpy.simulation.cobra.
GeckoSimulation
(model, envcond=None, constraints=None, solver=None, reference=None, reset_solver=False, protein_prefix=None)[source]¶ Bases:
mewpy.simulation.cobra.Simulation
Simulator for geckopy.gecko.GeckoModel
-
find_proteins
(pattern=None, sort=False)[source]¶ A user friendly method to find proteins in the model.
Parameters: - pattern (str, optional) – The pattern which can be a regular expression, defaults to None in which case all entries are listed.
- sort (bool, optional) – if the search results should be sorted, defaults to False
Returns: the search results
Return type: pandas dataframe
-
get_Kcats
(protein: str)[source]¶ Returns a dictionary of reactions and respective Kcat for a specific protein/enzyme·
Params (str) protein: the protein identifier. Returns: A dictionary of reactions and respective Kcat values.
-
protein_rev_reactions
¶ Identify if a reaction is reversible and returns the reverse reaction if it is the case.
Parameters: reaction_id ((str)) – The reaction identifier. Returns: A reaction identifier or None.
-
proteins
¶ Returns: A list of proteins identifiers.
-
-
class
mewpy.simulation.cobra.
Simulation
(model: cobra.core.model.Model, envcond: Optional[dict] = None, constraints: Optional[dict] = None, solver=None, reference: Optional[dict] = None, reset_solver: bool = False)[source]¶ Bases:
mewpy.simulation.cobra.CobraModelContainer
,mewpy.simulation.simulation.Simulator
-
FVA
(reactions: Optional[List[str]] = None, obj_frac: float = 0.9, constraints: Dict[str, Union[float, Tuple[float, float]]] = None, loopless: bool = False, solver=None, format: bool = 'dict') → Union[dict, DataFrame][source]¶ Flux Variability Analysis (FVA).
Parameters: - model – An instance of a constraint-based model.
- obj_frac (float) – The minimum fraction of the maximum growth rate (default 0.9). Requires that the objective value is at least the fraction times maximum objective value. A value of 0.85 for instance means that the objective has to be at least at 85% percent of its maximum.
- reactions (list) – List of reactions to analyze (default: all).
- constraints (dic) – Additional constraints (optional).
- loopless (boolean) – Run looplessFBA internally (very slow) (default: false).
- solver – A pre-instantiated solver instance (optional).
- format – The return format: ‘dict’, returns a dictionary,’df’ returns a data frame.
Returns: A dictionary of flux variation ranges.
-
add_compartment
(comp_id, name=None, external=False)[source]¶ Adds a compartment
Parameters: - comp_id (str) – Compartment ID
- name (str) – Compartment name, default None
- external (bool) – If the compartment is external, default False.
-
add_reaction
(rxn_id, name=None, stoichiometry=None, reversible=True, lb=-10000, ub=10000, gpr=None, objective=0, replace=True, annotations={}, reaction_type=None)[source]¶ Adds a reaction to the model
Parameters: - rxn_id (str) – The reaction identifier
- name (str, optional) – The name of the reaction, defaults to None
- stoichiometry (dict[str,float], optional) – The stoichiometry of the reaction, defaults to None
- reversible (bool, optional) – If the reaction is reversible or not, defaults to True
- lb (float, optional) – The reaction lower bound, defaults to ModelConstants.REACTION_LOWER_BOUND
- ub (float, optional) – The reaction upper bound, defaults to ModelConstants.REACTION_UPPER_BOUND
- gpr (str, optional) – The gene-protein-reactio rule, defaults to None
- objective (float, optional) – the objective coeficient, defaults to 0
- replace (bool, optional) – If replaces the reaction with a same identifier, defaults to True
-
environmental_conditions
¶
-
find_bounds
()[source]¶ Return the median upper and lower bound of the metabolic model. Bounds can vary from model to model. Cobrapy defaults to (-1000, 1000).
-
get_reaction_bounds
(reaction_id)[source]¶ Returns the bounds for a given reaction.
Parameters: reaction_id – str, reaction ID Returns: lb(s), ub(s), tuple
-
metabolite_reaction_lookup
(force_recalculate=False)[source]¶ Return the network topology as a nested map from metabolite to reaction to coefficient. :return: a dictionary lookup table
-
objective
¶
-
remove_reaction
(r_id)[source]¶ Removes a reaction from the model.
- Args:
- r_id (str): The reaction identifier.
-
reverse_reaction
(reaction_id)[source]¶ Identify if a reaction is reversible and returns the reverse reaction if it is the case.
Parameters: reaction_id – A reaction identifier. Returns: A reaction identifier or None.
-
set_reaction_bounds
(reaction_id, lb, ub, track=True)[source]¶ Sets the bounds for a given reaction. :param reaction_id: str, reaction ID :param float lb: lower bound :param float ub: upper bound :param bool track: if the changes are to be logged. Default True
-
simulate
(objective: Dict[str, float] = None, method: Union[mewpy.simulation.simulation.SimulationMethod, str] = <SimulationMethod.FBA: 'FBA'>, maximize: bool = True, constraints: Dict[str, Union[float, Tuple[float, float]]] = None, reference: Dict[str, float] = None, scalefactor: float = None, solver=None, slim: bool = False, shadow_prices: bool = False) → mewpy.simulation.simulation.SimulationResult[source]¶ Simulates a phenotype when applying a set constraints using the specified method.
Parameters: - objective (dic) – The simulation objective. If none, the model objective is considered.
- method – The SimulationMethod (FBA, pFBA, lMOMA, etc …)
- maximize (boolean) – The optimization direction.
- constraints (dic) – A dictionary of constraints to be applied to the model.
- reference (dic) – A dictionary of reaction flux values.
- scalefactor (float) – A positive scaling factor for the solver. Default None.
-
mewpy.simulation.reframed module¶
-
class
mewpy.simulation.reframed.
CBModelContainer
(model: reframed.core.cbmodel.CBModel = None)[source]¶ Bases:
mewpy.simulation.simulation.ModelContainer
A basic container for REFRAMED models.
Parameters: model – A metabolic model. -
compartments
¶ Returns: A list of compartments identifiers.
-
genes
¶ Returns: A list of gene identifiers.
-
id
¶
-
medium
¶
-
metabolites
¶ Returns: A list of metabolite identifiers.
-
reactions
¶ Returns: A list of reaction identifiers.
-
-
class
mewpy.simulation.reframed.
GeckoSimulation
(model: mewpy.model.gecko.GeckoModel, envcond=None, constraints=None, solver=None, reference=None, reset_solver=False, protein_prefix=None)[source]¶ Bases:
mewpy.simulation.reframed.Simulation
-
get_Kcats
(protein: str)[source]¶ Returns a dictionary of reactions and respective Kcat for a specific protein/enzyme·
Params (str) protein: the protein identifier. Returns: A dictionary of reactions and respective Kcat values.
-
protein_reactions
(protein: str)[source]¶ Returns the list of reactions associated with a protein.
Params (str) protein: The protein identifier. Returns: A list of reaction identifiers
-
protein_rev_reactions
¶
-
proteins
¶ Returns: A list of proteins identifiers.
-
-
class
mewpy.simulation.reframed.
Simulation
(model: reframed.core.cbmodel.CBModel, envcond=None, constraints=None, solver=None, reference=None, reset_solver=False)[source]¶ Bases:
mewpy.simulation.reframed.CBModelContainer
,mewpy.simulation.simulation.Simulator
- Generic Simulation class for cobra Model.
- Defines the simulation conditions, and makes available a set of methods.
Parameters: model – An metabolic model instance. Optional: :param dic envcond: Dictionary of environmental conditions. :param dic constraints: A dictionary of reaction constraints. :param solver: An instance of the LP solver. :param dic reference: A dictionary of the wild type flux values.
-
FVA
(reactions=None, obj_frac=0.9, constraints=None, loopless=False, internal=None, solver=None, format='dict')[source]¶ Flux Variability Analysis (FVA).
Parameters: - model – An instance of a constraint-based model.
- obj_frac (float) – The minimum fraction of the maximum growth rate (default 0.9). Requires that the objective value is at least the fraction times maximum objective value. A value of 0.85 for instance means that the objective has to be at least at 85% percent of its maximum.
- reactions (list) – List of reactions to analyze (default: all).
- constraints (dic) – Additional constraints (optional).
- loopless (boolean) – Run looplessFBA internally (very slow) (default: false).
- internal (list) – List of internal reactions for looplessFBA (optional).
- solver – A pre-instantiated solver instance (optional)
- format – The return format: ‘dict’, returns a dictionary,’df’ returns a data frame.
Returns: A dictionary of flux variation ranges.
-
add_compartment
(comp_id, name=None, external=False)[source]¶ Adds a compartment
Parameters: - comp_id (str) – Compartment ID
- name (str) – Compartment name, default None
- external (bool) – If the compartment is external, default False.
-
add_metabolite
(id, formula=None, name=None, compartment=None)[source]¶ Adds a metabolite
Parameters: - id ([type]) – [description]
- formula ([type], optional) – [description], defaults to None
- name ([type], optional) – [description], defaults to None
- compartment ([type], optional) – [description], defaults to None
-
add_reaction
(rxn_id, name=None, stoichiometry=None, reversible=True, lb=-10000, ub=10000, gpr=None, objective=0, replace=True, annotations={}, reaction_type=None)[source]¶ Adds a reaction to the model
Parameters: - rxn_id (str) – The reaction identifier
- name (str, optional) – The name of the reaction, defaults to None
- stoichiometry (dict[str,float], optional) – The stoichiometry of the reaction, defaults to None
- reversible (bool, optional) – If the reaction is reversible or not, defaults to True
- lb (float, optional) – The reaction lower bound, defaults to ModelConstants.REACTION_LOWER_BOUND
- ub (float, optional) – The reaction upper bound, defaults to ModelConstants.REACTION_UPPER_BOUND
- gpr (str, optional) – The gene-protein-reactio rule, defaults to None
- objective (float, optional) – the objective coeficient, defaults to 0
- replace (bool, optional) – If replaces the reaction with a same identifier, defaults to True
-
environmental_conditions
¶
-
find_bounds
()[source]¶ Return the median upper and lower bound of the metabolic model. Bounds can vary from model to model. Cobrapy defaults to (-1000, 1000).
-
get_reaction_bounds
(reaction)[source]¶ Returns the bounds for a given reaction. :param reaction: str, reaction ID :return: lb(s), ub(s), tuple
-
metabolite_reaction_lookup
(force_recalculate=False)[source]¶ Return the network topology as a nested map from metabolite to reaction to coefficient. :return: a dictionary lookup table
-
objective
¶
-
remove_reaction
(r_id)[source]¶ Removes a reaction from the model.
- Args:
- r_id (str): The reaction identifier.
-
reverse_reaction
(reaction_id)[source]¶ Identify if a reaction is reversible and returns the reverse reaction if it is the case.
Parameters: reaction_id – A reaction identifier. Returns: A reverse reaction identifier or None
-
set_reaction_bounds
(reaction, lb, ub, track=True)[source]¶ Sets the bounds for a given reaction. :param reaction: str, reaction ID :param float lb: lower bound :param float ub: upper bound :param bool track: if the changes are to be logged. Default True
-
simulate
(objective=None, method=<SimulationMethod.FBA: 'FBA'>, maximize=True, constraints=None, reference=None, scalefactor=None, solver=None, slim=False, shadow_prices=False)[source]¶ Simulates a phenotype when applying a set constraints using the specified method.
Parameters: - objective (dic) – The simulation objective. If none, the model objective is considered.
- method – The SimulationMethod (FBA, pFBA, lMOMA, etc …)
- maximize (boolean) – The optimization direction
- constraints (dic) – A dictionary of constraints to be applied to the model.
- reference (dic) – A dictionary of reaction flux values.
- scalefactor (float) – A positive scaling factor for the solver. Default None.
- solver – An instance of the solver.
mewpy.simulation.simulation module¶
-
class
mewpy.simulation.simulation.
ModelContainer
[source]¶ Bases:
abc.ABC
Interface for Model container. Provides an abstraction from models implementations.
-
compartments
¶ Returns: A list of compartments identifiers.
-
genes
¶ Returns: A list of gene identifiers.
-
get_gpr
(rxn_id)[source]¶ Returns: A string representation of the reaction GPR if exists None otherwise.
-
medium
¶
-
metabolites
¶ Returns: A list of metabolite identifiers.
-
proteins
¶ Returns: A list of proteins identifiers.
-
reactions
¶ Returns: A list of reaction identifiers.
-
-
class
mewpy.simulation.simulation.
SStatus
[source]¶ Bases:
enum.Enum
Enumeration of possible solution status.
-
INFEASIBLE
= 'Infeasible'¶
-
INF_OR_UNB
= 'Infeasible or Unbounded'¶
-
OPTIMAL
= 'Optimal'¶
-
SUBOPTIMAL
= 'Suboptimal'¶
-
UNBOUNDED
= 'Unbounded'¶
-
UNKNOWN
= 'Unknown'¶
-
-
class
mewpy.simulation.simulation.
SimulationMethod
[source]¶ Bases:
enum.Enum
An enumeration.
-
FBA
= 'FBA'¶
-
MOMA
= 'MOMA'¶
-
NONE
= 'NONE'¶
-
ROOM
= 'ROOM'¶
-
lMOMA
= 'lMOMA'¶
-
pFBA
= 'pFBA'¶
-
-
class
mewpy.simulation.simulation.
SimulationResult
(model, objective_value, fluxes=None, status=None, envcond=None, model_constraints=None, simul_constraints=None, maximize=True, method=None, shadow_prices=None)[source]¶ Bases:
object
Class that represents simulation results and performs operations over them.
-
dataframe
¶
-
find
(pattern=None, sort=False, shadow_prices=False, show_nulls=False)[source]¶ Returns a dataframe of reactions and their fluxes matching a pattern or a list of patterns.
Parameters: - pattern – a string or a list of strings. defaults to None
- sort (bool, optional) – If the dataframe is to be sorted by flux rates. Defaults to False
Returns: returns a dataframe.
Return type: pandas.DataFrame
-
classmethod
from_linear_solver
(solution)[source]¶ Converts a solver Solution object to a SolutionResult object.
Parameters: solution (mewpy.solver.solution.Solution) – solution to be converted Raises: ValueError – if solution is not an instance of mewpy.solver.solution.Solution Returns: an instance of SolutionResult Return type: SolutionResult
-
get_constraints
()[source]¶ Returns: All constraints applied during the simulation both persistent and simulation specific.
-
get_metabolite
(met_id, format='df')[source]¶ Displays the consumption/production of a metabolite in the reactions it participates.
Parameters: - met_id (str) – the metabolite identifier.
- format (str) – the display format (pandas.DataFrame or dict). Default ‘df’ pandas.DataFrame
- Returns:
- dict or pandas.DataFrame: metabolite turnover rates
-
-
class
mewpy.simulation.simulation.
Simulator
[source]¶ Bases:
mewpy.simulation.simulation.ModelContainer
,mewpy.simulation.simulation.SimulationInterface
Interface for simulators
-
FVA
(reactions=None, obj_frac=0, constraints=None, loopless=False, internal=None, solver=None)[source]¶ Abstract method to run Flux Variability Analysis (FVA).
Returns: A dictionary of flux range values.
-
blocked_reactions
(constraints=None, reactions=None, abstol=1e-09)[source]¶ Find all blocked reactions in a model
Parameters: - constraints ((dict)) – additional constraints (optional)
- reactions ((list)) – List of reactions which will be tested (default: None, test all reactions)
- abstol ((float)) – absolute tolerance (default: 1e-9)
Returns: A list blocked reactions
-
essential_genes
(min_growth=0.01)[source]¶ Essential genes are those when deleted enable a biomass flux value above a minimal growth defined as a percentage of the wild type growth.
Parameters: min_growth (float) – Minimal percentage of the wild type growth value. Default 0.01 (1%). Returns: A list of essential genes.
-
essential_reactions
(min_growth=0.01)[source]¶ Essential reactions are those when knocked out enable a biomass flux value above a minimal growth defined as a percentage of the wild type growth.
Parameters: min_growth (float) – Minimal percentage of the wild type growth value. Default 0.01 (1%). Returns: A list of essential reactions.
-
evaluate_gprs
(active_genes)[source]¶ Returns the list of active reactions for a given list of active genes.
Parameters: active_genes (list) – List of genes identifiers. Returns: A list of active reaction identifiers.
-
find
(pattern=None, sort=False, find_in='r')[source]¶ A user friendly method to find metabolites, reactions or genes in the model.
Parameters: - pattern (str, optional) – The pattern which can be a regular expression, defaults to None in which case all entries are listed.
- sort (bool, optional) – if the search results should be sorted, defaults to False
- find_in (str, optional) – The search set: ‘r’ for reactions, ‘m’ for metabolites, ‘g’ for genes, defaults to ‘r’
Returns: the search results
Return type: pandas dataframe
-
get_metabolite_consumers
(m_id)[source]¶ Returns the list or reactions that consume a metabolite.
Parameters: m_id – the metabolite identifier Returns: A list of reaction identifiers
-
get_metabolite_producers
(m_id: str) → List[str][source]¶ Returns the list or reactions that produce a metabolite.
Parameters: m_id – the metabolite identifier Returns: A list of reaction identifiers
-
get_metabolite_reactions
(m_id: str) → List[str][source]¶ Returns the list or reactions that produce or consume a metabolite.
Parameters: m_id – the metabolite identifier Returns: A list of reaction identifiers
-
reference
¶ The reference wild type reaction flux values.
Returns: A dictionary of wild type reaction flux values.
-
simulate
(objective=None, method=<SimulationMethod.FBA: 'FBA'>, maximize=True, constraints=None, reference=None, solver=None, **kwargs)[source]¶ Abstract method to run a phenotype simulation.
Returns: A SimulationResult.
-
simulate_mp
(objective=None, method=<SimulationMethod.FBA: 'FBA'>, maximize=True, constraints_list=None, reference=None, solver=None, jobs=None, desc='Parallel Simulation', **kwargs)[source]¶ Parallel phenotype simulations.
Parameters: - objective ((dict)) – The simulations objective. If none, the model objective is considered.
- method ((SimulationMethod)) – The SimulationMethod (FBA, pFBA, lMOMA, etc …)
- maximize ((boolean)) – The optimization direction
- constraints ((dict)) – A dictionary of constraints to be applied to the model.
- reference ((dict)) – A dictionary of reaction flux values.
- solver ((Solver)) – An instance of the solver.
- jobs ((int)) – The number of parallel jobs.
- desc ((str)) – Description to present in tdqm.
-
-
mewpy.simulation.simulation.
simulate
(model, envcond=None, objective=None, method=<SimulationMethod.FBA: 'FBA'>, maximize=True, constraints=None, reference=None, solver=None, **kwargs)[source]¶ Runs an FBA phenotype simulation.
Parameters: model – cobrapy, reframed, GERM constraint-base model :param (dict ) envcond : Environmental conditions, defaults to None :param (dict) objective: the FBA objective , defaults to None :param method: The FBA method, defaults to SimulationMethod.FBA :param maximize: optimization sense, defaults to True :param (dict) constraints: additional constraints , defaults to None :param (dict) reference: reference fluxes, defaults to None :param solver: a solver instance, defaults to None
Returns: SimultationResult
mewpy.utils package¶
Submodules¶
mewpy.utils.constants module¶
mewpy.utils.crossmodel module¶
mewpy.utils.graph module¶
mewpy.utils.parsing module¶
mewpy.utils.process module¶
mewpy.utils.utilities module¶
Module contents¶
mewpy.visualization package¶
Submodules¶
mewpy.visualization.envelope module¶
-
mewpy.visualization.envelope.
flux_envelope
(model, r_x, r_y, steps=10, constraints=None, x_range=None, tolerance=0)[source]¶ - Calculate the flux envelope for a pair of reactions.
- Adapted from REFRAMED to be compatible both with REFRAMED and COBRApy.
:param model : The model or simulator. :param str r_x: Reaction on x-axis. :param str r_y: Reaction on y-axis. :param int steps: Number of steps to compute (default: 10). :param dict constraints: Custom constraints to the FBA problem. :param dict envcond: Environmental conditions. :param tuple range: x value range. Default None. :returns: x values, y_min values, y_max values
-
mewpy.visualization.envelope.
plot_flux_envelope
(model, r_x, r_y, steps=10, substrate=None, constraints=None, label_x=None, label_y=None, flip_x=False, flip_y=False, plot_kwargs=None, fill_kwargs=None, ax=None, x_range=None)[source]¶ - Plots the flux envelope for a pair of reactions.
- Adapted from REFRAMED.
Parameters: - model – The model or simulator.
- r_x (str) – Reaction on x-axis.
- r_y (str) – Reaction on y-axis.
- steps (int) – Number of steps to compute (default: 20).
- substrate (str) – Compute yields for given substrate instead of rates (optional).
- constraints (dict) – Additional simulation constraints.
- label_x (str) – x label (optional, uses reaction name by default).
- label_y (str) – y label (optional, uses reaction name by default).
- flip_x (bool) – Flip direction of r_x (default: False).
- flip_y (dict) – Flip direction of r_y (default: False).
- plot_kwargs (dict) – Additional parameters to pyplot.plot (optional).
- fill_kwargs (dict) – Additional parameters to pyplot.fill_between (optional).
- ax (matplotlib.Axes) – Plot over existing axes (optional).
- range (tuple) – x value range. Default None.
Returns: matplotlib.Axes: Axes object.
mewpy.visualization.escher module¶
-
mewpy.visualization.escher.
build_escher
(model=None, fluxes=None, fmt_func=<function remove_prefix>, **kwargs)[source]¶
mewpy.visualization.plot module¶
-
class
mewpy.visualization.plot.
Plot
(plot_title='Pareto Aproximation', reference_front=None, reference_point=None, axis_labels=None)[source]¶ Bases:
object
-
static
get_points
(solutions)[source]¶ Get points for each solution of the front.
Parameters: solutions – List of solutions where each solution is a list of fitness values Returns: Pandas dataframe with one column for each objective and one row for each solution.
-
pcoords
(fronts, normalize=False, filename=None, format='eps')[source]¶ Plot any arbitrary number of fronts in parallel coordinates.
Parameters: - fronts – List of fronts (containing solutions).
- filename – Output filename.
-
plot
(front, label='', normalize=False, filename=None, format='eps')[source]¶ Plot any arbitrary number of fronts in 2D, 3D or p-coords.
Parameters: - front – Pareto front or a list of them.
- label – Pareto front title or a list of them.
- normalize – If True, normalize data (for p-coords).
- filename – Output filename.
- format – Output file format.
-
static