Interoperability¶
We try to maximize interoperability between the PfLine and PfState classes and common Python classes. At the same time, the outcome of any operation should be unambiguous, so that we never have to guess at it or be surprised by it.
Note
This section assumes you are familiar with Dimensions and Units.
For our purposes, the most common data container is the timeseries, though on some occasions we are dealing with single values (e.g. a single, time-independent, price of 45 Eur/MWh). There are several ways for the user to provide this information.
In the code examples below, the following imports and variables are assumed:
import portfolyo as pf
import pandas as pd
idx = pd.date_range("2023", freq="YS", periods=2)
One value¶
To pass a single value, the following objects can be used:
A float or int value.
A pint.Quantity, which is unit-aware. For convenience, the Quantity class (with the relevant unit registry) is available at portfolyo.Q_():
pf.Q_(50.0, "Eur/MWh")
<Quantity(50.0, 'euro / megawatthour')>
The unit is converted to the default unit for its dimension once it is used in any of the portfolyo objects; see also the section on compatibility of abbreviation and unit further below.
See pint’s website for more information about pint.
Hint
Using a pint.Quantity expresses a more deliberate intent, and therefore allows us to catch dimensionality errors more easily. For dimensionless values, such as fractions, we could even use a dimensionless Quantity (though this quickly becomes cumbersome).
One or more values¶
If we have to specify several individual values, we can use:
A dictionary with one or more of the dimension abbreviations ("w", "q", "p", "r", "nodim") as the keys, and float, int or pint.Quantity instances as the values. E.g.:
{"p": 50.0, "w": pf.Q_(120, 'MW')}
{'p': 50.0, 'w': <Quantity(120.0, 'megawatt')>}
Or we can use any other Mapping from string values to floats, ints, or pint.Quantity objects, e.g., a pandas.Series with a string index. It is recommended, however, to use Series only for timeseries information.
Note
Because we have to explicitly state the dimension abbreviation, these objects help us avoid dimensionality errors. For this reason, we may want to use them, even for single values.
One timeseries¶
Warning
To avoid unexpected behavior, timeseries (pandas.Series and pandas.DataFrame objects) should be of a certain form. See Preprocessing input data.
For timeseries, pandas.Series are used. These can be “unit-agnostic” (i.e., of datatype float or int), or unit-aware as in the example below. [1]
pd.Series([50, 56.0], idx, dtype="pint[Eur/MWh]") # unit-aware
2023-01-01 50.0
2024-01-01 56.0
Freq: YS-JAN, dtype: pint[Eur/MWh]
Warning
The name attribute of a pandas.Series is always ignored.
One or more timeseries¶
To pass several timeseries, we can use:
A dictionary with one or more of the dimension abbreviations ("w", "q", "p", "r", "nodim") as the keys, and timeseries as the values. E.g.:
{"p": pd.Series([50, 56], idx), "w": pd.Series([120, 125], idx, dtype="pint[MW]")}
{'p': 2023-01-01    50.0
 2024-01-01    56.0
 Freq: YS-JAN, dtype: float64,
 'w': 2023-01-01    120.0
 2024-01-01    125.0
 Freq: YS-JAN, dtype: pint[MW]}
Each of the timeseries can have a unit or be unit-agnostic.
Or we can use any other Mapping from string values to timeseries, e.g., a pandas.DataFrame with a datetime-index. For example:
pd.DataFrame({"p": [50, 56], "w": [120, 125]}, idx)
               p      w
2023-01-01  50.0  120.0
2024-01-01  56.0  125.0
Dataframes can also be made unit-aware. [2]
Note
The same applies here: because we have to explicitly state the dimension abbreviation, these objects help us avoid dimensionality errors. For this reason, we may want to use them, even for single timeseries.
Combinations¶
Dictionaries are the most versatile of these objects. They can be used to pass a single value, multiple values, a single timeseries, multiple timeseries, or a combination of these:
d1 = {"p": 50}
d2 = {"p": 50, "w": 120}
d3 = {"p": pd.Series([50, 56], idx)}
d4 = {"p": pd.Series([50, 56], idx), "w": pd.Series([120, 125], idx)}
d5 = {"p": pd.Series([50, 56], idx), "w": 120}
Duck typing for other objects¶
Any object can be used, as long as it has an .items() method returning (key, value)-tuples (e.g. because it inherits from the Mapping abstract base class and therefore implements the __getitem__, __iter__ and __len__ methods), and all keys are valid dimension abbreviations.
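As a minimal sketch of what such an object could look like: the class PriceAndVolume and the helper has_valid_keys below are hypothetical names introduced only for illustration, and the key check is not necessarily how portfolyo validates its input.

```python
from collections.abc import Mapping

# Dimension abbreviations listed in this section.
VALID_KEYS = {"w", "q", "p", "r", "nodim"}


class PriceAndVolume(Mapping):
    """Hypothetical container; inheriting from Mapping provides .items()."""

    def __init__(self, **values):
        self._values = dict(values)

    def __getitem__(self, key):
        return self._values[key]

    def __iter__(self):
        return iter(self._values)

    def __len__(self):
        return len(self._values)


def has_valid_keys(obj) -> bool:
    """Illustrative check: every key from .items() is a dimension abbreviation."""
    return all(key in VALID_KEYS for key, _ in obj.items())


data = PriceAndVolume(p=50.0, w=120.0)
print(list(data.items()))    # (key, value)-tuples, just like a dict
print(has_valid_keys(data))
```

Because only .items() and the keys matter, a plain dict, a pandas.Series with a string index, and a custom class like this one could all be treated the same way.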
Compatibility of abbreviation and unit¶
Information can have a key (one of the dimension abbreviations: "w", "q", "p", "r", "nodim") and/or a unit. In a DataFrame, a timeseries’ key is the corresponding column name. A timeseries ‘by itself’ has no key; its name is ignored.
There is a one-to-one relationship between dimension abbreviation and unit; see Dimensions and Units.
In some of the objects discussed above, we specify both a key and a unit. In that case, portfolyo checks if the unit has the correct dimensionality. If so, but it is not the default unit, a conversion to the default unit is done.
E.g., the key "p" and unit ctEur/kWh of {"p": pd.Series([5.0, 5.6], idx, dtype="pint[ctEur/kWh]")} are consistent. The values will be changed to the default unit (= Eur/MWh) upon further processing. Using "q" instead of "p" results in a dimensionality error, and using "x" results in a KeyError.
In some objects, only the unit is specified. Here, the dimension is deduced from the unit, and the unit is converted into the default (if necessary).
E.g., the timeseries pd.Series([5.0, 5.6], idx, dtype="pint[ctEur/kWh]") (NB: without the dictionary key) is such an object.
In other objects, only the key is specified. In that case, the unit is deduced from the key: the default unit is assumed.
E.g., the key "p" of {"p": pd.Series([50, 56], idx)} indicates that we are dealing with prices, and the default unit of Eur/MWh is assumed.
If neither is provided, the dimension must be inferable from the context, and the unit is assumed to be the default for that dimension.
E.g., when adding a float value to a PfLine containing prices, the value is assumed to also be a price, in the default unit (= Eur/MWh).
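The four cases above can be summarized in a small plain-Python sketch. It illustrates the rules only and is not portfolyo's actual implementation: the lookup tables and the function resolve are hypothetical, only two dimensions are filled in, the default unit MWh for "q" is an assumption, and the built-in TypeError stands in for a dimensionality error.

```python
# Hypothetical lookup tables; only two dimensions are filled in for brevity.
DEFAULT_UNIT = {"p": "Eur/MWh", "q": "MWh"}  # "MWh" for "q" is an assumption
# unit -> (dimension key, factor that converts a value into the default unit)
UNIT_INFO = {
    "Eur/MWh": ("p", 1.0),
    "ctEur/kWh": ("p", 10.0),  # 1 ctEur/kWh = 0.01 Eur / 0.001 MWh = 10 Eur/MWh
    "MWh": ("q", 1.0),
}


def resolve(value, key=None, unit=None, context_key=None):
    """Return (dimension key, value in default unit, default unit)."""
    if key is not None and key not in DEFAULT_UNIT:
        raise KeyError(key)  # unknown abbreviation, e.g. "x"
    if unit is not None:
        unit_key, factor = UNIT_INFO[unit]
        if key is not None and key != unit_key:
            # Stand-in for a dimensionality error.
            raise TypeError(f"unit {unit!r} does not fit key {key!r}")
        key = unit_key  # deduced from the unit if no key was given
        value = value * factor  # convert to the default unit
    elif key is None:
        if context_key is None:
            raise ValueError("dimension cannot be inferred")
        key = context_key  # neither given: inferred from the context
    return key, value, DEFAULT_UNIT[key]


print(resolve(5.0, key="p", unit="ctEur/kWh"))  # key and unit given
print(resolve(5.0, unit="ctEur/kWh"))           # only the unit given
print(resolve(50.0, key="p"))                   # only the key given
print(resolve(50.0, context_key="p"))           # neither given
```

All four calls describe the same price of 50 Eur/MWh; calling resolve(5.0, key="q", unit="ctEur/kWh") raises the stand-in dimensionality error, and key="x" raises a KeyError.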