# HR Policies and Maternal Labor Supply

## The Example of Employer-Supported Childcare


## Susanne Schneider

The author asks how far the extension of employer-supported childcare (ESCC) serves as a driver for higher maternal labor supply. She addresses this question by categorizing employer-supported childcare as an efficiency wage introduced by the employer to increase the working volume of mothers. Applying various impact evaluation techniques in an econometric analysis, the author concludes that the availability of employer-supported childcare has a positive impact on the length and working volume of mothers who return to work after giving birth. Furthermore, the usage of employer-supported childcare by mothers with pre-school-age children positively influences the agreed and actual working hours.

# 5. Research methodology: Measuring the Effects of ESCC


**5.1 Research objectives and questions**

The goal of this thesis is to provide a comprehensive overview of the effects of ESCC on the employment rates of mothers and to answer the question of whether mothers who work for employers with ESCC are more likely to intensify their work behavior than mothers whose employer does not offer ESCC.

The operationalization of “work behavior” is important here. Within the field of impact evaluations, work behavior can be analyzed in two different ways: firstly, it can be measured in terms of quantifiable work output. This implies for instance working time in terms of working volume, time-to-return-to-job after work interruptions, and productivity. Since productivity is often hard to measure (except for working on assembly lines), the working time is often taken as an alternative measurement. Secondly, it can be traced in working attitude. The working attitude might be measured in terms of job satisfaction, turnover intentions, or work-life conflicts, according to Cropanzano and Mitchell (2005). While the quantifiable work output is often analyzed with micro data of employees, the working attitude is mainly investigated with qualitative questionnaires. This thesis approaches the working behavior in terms of quantifiable output, meaning the working interruption per se and the working volume.

Distinct sub-research questions have been developed to aid in answering this question in a productive manner. Work behavior will be explored for each of the distinct stages of being a mother. It is differentiated between the return to work after childbirth and the working volume of mothers with young children. Possible benefits to the mother from ESCC, if used by herself or her partner, are examined. While the previous applications of work behavior focus explicitly on the working time as measured in quantitative terms, the theoretical framework has already hinted that the effects might go beyond this definition. One sub-research question addresses the qualitative effects of ESCC in terms of the working attitude. To be able to discuss the effects of ESCC for maternal labor supply comprehensively and analyze its implications, effects of ESCC for the provider will also be considered. As the theoretical framework demonstrates, the employers provide efficiency wages for employees since they expect a greater working effort in return. Accordingly, while it is not the focus of the thesis, a distinct part of the analysis is dedicated to explore the effects of ESCC on the employer in monetary terms.

i. What is the effect of ESCC on the duration of work interruption after childbirth?

ii. What is the effect of ESCC on the type of employment (marginal, part-time, full-time) after childbirth?

iii. Does ESCC have an effect on the working hours when a mother has young children?

iv. Is there an impact of ESCC on the work behavior of mothers when the employer of the father offers ESCC?

v. Are there further effects of ESCC which go beyond the narrow measurement of labor supply for mothers?

vi. Are there additional positive effects of ESCC for the provider of ESCC, which is important since ESCC is offered voluntarily?

The research strategy can be categorized as explanatory research. The primary goal is to identify the relationship between two variables and any additional factors influencing that relationship. Hence, it is questioned whether a causal relationship exists between two variables. Both the dependent variable and its potential causes are able to change over time (Babbie, 2015), which requires a deductive approach. This approach implies a top-down analysis: a broad theoretical framework is presented, and specific hypotheses are derived from it. These hypotheses are tested by means of an econometric analysis to assess whether there is a causal impact. Finally, the hypotheses can be confirmed or rejected, leading to a comprehensive conclusion regarding the relationship between both variables (Shadish, Cook, & Campbell, 2002).

The time frame of this study covers the years from 2009 to 2012. The time span is given by the data set. The data collection took place after the parents’ money reform in 2007, but before the introduction of parents’ money plus in 2015. Furthermore, at the end of the data collection process, the legal entitlement to a childcare place for children older than one year had been introduced in Germany. The specific time span of the dataset might in itself be a distinct factor for the analyzed working behavior of mothers.

The research strategy includes several pre-tests conducted to exclude variables without explanatory power from the statistical analysis. Consequently, some variables that could hypothetically be used were excluded beforehand. These include the commuting distance and the company size. Further variables have been excluded in distinct specifications. Furthermore, pre-tests on interactions of the control variables, for instance between living in former East or West Germany and income, were conducted to reveal whether there are significant dependencies between them. The pre-tests revealed, however, that these interactions should not be included in the final regression specifications.

To estimate the relationship between two variables, the potential influence of additional factors must be excluded.

Figure 5-1: Schematic illustration of relationship between two variables and third factors

Source: Own design

Figure 5-1 demonstrates that there might be additional factors influencing the relationship between two variables. In this context, mothers might react differently to ESCC depending on the local childcare situation. For instance, mothers living in an area where kindergartens offer above-average opening times might value ESCC less than mothers who depend on restricted opening times. Based on the theoretical framework and the literature review, potential third factors have been identified and are included in the econometric model.

There are two further challenges which hamper the estimation of the true impact of ESCC on maternal labor supply. Therefore, the remaining part of this section discusses the relevance of a control group and the problem of self-selection.

The major challenge of impact evaluation “is to determine what would have happened to the beneficiaries if the program had not existed” (Khandker, Koolwal, & Samad, 2010, p. 43). The outcome for an observation without the policy intervention is the counterfactual, a hypothetical outcome. The challenge is therefore to assess the true impact of the policy intervention independent of other factors by comparing actual and counterfactual outcomes, with the obstacle that the counterfactual outcome can never be observed. Ideally, one would be able to observe the two different states simultaneously. Since this is not possible, one is forced to find an appropriate control group. There are generally two ways to find an appropriate control group, which can also be combined. One is to compare the behavior of individuals benefiting from the policy intervention with that of individuals who do not have access to it. The advantage here is that the comparison is conducted at the same point in time, so it can be assumed that time-related factors do not influence the effect. The other way is to compare the behavior of the benefiting individuals before and after the introduction of the policy intervention. One advantage here is that unobservable factors can be taken into account. For instance, it might be that certain preferences of mothers are relevant for the reaction towards ESCC (Khandker et al., 2010). The following equation represents the basic evaluation problem:

$Y_i = \alpha X_i + \beta T_i + \varepsilon_i$ (5.1)

whereby *Y* denotes the outcome of the policy intervention, *i* indexes the treated and non-treated individuals, *T* is a dummy variable indicating whether an individual benefits from the policy intervention, and *X* refers to other observable characteristics of the observations. *ε* is the error term, which includes unobservable characteristics that also affect *Y*. This equation reflects the basic approach of measuring the direct effect of the program *T* on the outcome *Y*. The problem with the equation is that the assignment of the policy intervention might not be random, either because the policy is placed on purpose by the provider or because individuals self-select into the program. The first point implies that policies are introduced in response to special needs, so that only previously defined observations benefit from the program. The second point highlights that individuals self-select into the program due to observable or unobservable characteristics. In the case of unobservable factors, the error term contains further variables that are also correlated with the treatment variable. Since they are unobservable (for instance certain preferences), they cannot be measured or controlled for. Therefore, one of the key assumptions of ordinary least squares estimation is violated, namely the independence of the regressors from the disturbance term. The violation is stated as *cov(T, ε) ≠ 0*. The correlation between the treatment and the error term also biases the other estimates in the equation, so that the estimated impact is biased as well. One way to deal with this problem is to assume that selection bias disappears if “one could assume that whether or not households or individuals receive treatment (conditional on a set of covariates, X) were independent of the outcomes that they have” (Khandker et al., 2010, p. 48). This assumption of unconfoundedness is also known as the conditional independence assumption.
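The consequence of *cov(T, ε) ≠ 0* can be illustrated with a small simulation. The following is a minimal sketch, not part of the thesis: all coefficient values, the sample size, and the selection rule are hypothetical, and an intercept is added to the regression for convenience. It estimates equation 5.1 by ordinary least squares once under random assignment and once when an unobserved preference drives both the treatment and the outcome.

```python
import numpy as np

def ols(y, columns):
    """Return OLS coefficients for y on an intercept plus the given columns."""
    design = np.column_stack([np.ones(len(y))] + list(columns))
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

rng = np.random.default_rng(0)
n = 20_000
alpha, beta = 1.0, 2.0            # hypothetical true coefficients

x = rng.normal(size=n)            # observed characteristic X
u = rng.normal(size=n)            # unobserved preference, part of the error term

# Random assignment: treatment is independent of the unobserved preference.
t_random = (rng.random(n) < 0.5).astype(float)
# Self-selection: a stronger unobserved preference raises the chance of treatment.
t_selected = (u + rng.normal(size=n) > 0).astype(float)

def outcome(t):
    eps = u + rng.normal(size=n)  # error term contains the unobserved preference
    return alpha * x + beta * t + eps

# Index 2 is the coefficient on T (after the intercept and X).
beta_random = ols(outcome(t_random), [x, t_random])[2]
beta_selected = ols(outcome(t_selected), [x, t_selected])[2]
print(f"random assignment: {beta_random:.2f}, self-selection: {beta_selected:.2f}")
```

Under random assignment the estimate stays close to the assumed value of 2, whereas under self-selection it is pushed upward because the treatment dummy picks up part of the unobserved preference.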

Two ways to address these presented challenges are discussed and applied in section 5.4. “Impact evaluation techniques and applications”. Beforehand, the data set is described.

The main requirement for the data set concerns the identification of variables on ESCC. There are currently two data sets which would fulfill this requirement. The linked employee-employer dataset is the only database in Germany, which provides representative micro data on employer and employees simultaneously. It includes variables on gender, age, nationality, qualification, education, working volume, economic sector of establishment, place of residence and work or first day in establishment. Some variables can be used to derive further information. For example, extraordinary wage increases can be identified as indicators for the uniqueness of an employee. The linked employee-employer dataset has an extra questionnaire on a regular basis regarding family-friendly HR policies including the question “Does the company offer childcare assistance?” This questionnaire has been used in the years 2002, 2004, 2008 and 2012 (Heining, Scholz, & Seth, 2013).

The alternative data set is FiD. This data set is an integrated part of the Socio-Economic Panel for Germany (SOEP). Data collection took place between 2010 and 2013. It was established for the study “Zur Gesamtevaluation der ehe- und familienbezogenen Leistungen in Deutschland”, which was conducted and published in the year 2014 by the BMFSFJ (2014). This new data set represents data sets for the following societal groups not sufficiently covered in the SOEP: single parents, low income families, large families with three or more children, and families with particularly young children. The data set includes information on household characteristics, education, past and current labor market experiences, earnings and income, housing characteristics, health, select preferences, and life satisfaction. In addition, there is a stronger focus on children and partnership. This implies that mothers and fathers are asked specific questions regarding their childcare decisions and ensuing satisfaction. Specifically, they were asked whether the employer offers on-site childcare, rents childcare spots in local kindergartens, and/or offers financial support. Thereby, it is differentiated between availability and usage of ESCC (Schröder, Siegers, & Spieß, 2013).

The FiD data is preferred here because it offers more detailed questions regarding ESCC. In addition, the FiD contains specific information on childbirth and childcare. For instance, the date of birth of children is not included in the linked employer-employee data set, but needs to be estimated by means of the social insurance reports of the employees. The FiD also provides a large number of questions on the household and individual circumstances of an employee. Some of this information is included in the linked employer-employee data set as well, but it is recorded when the person enters the data set and not updated afterwards. Altogether, the linked employer-employee data set offers more information on the employer, while the FiD contains only basic employer information (for instance the economic sector). Hence, for the purpose of answering the research question stated above, the FiD seems to be more suitable. For a research question on the introduction of ESCC itself, the linked employer-employee data set would probably be more suitable.

**5.4 Impact evaluation techniques and application**

This sub-section describes the econometric models applied in the subsequent empirical analysis. After their description, the models are applied to the situation at hand. With regard to the duration of the work interruption after childbirth, event history analysis (EHA) is used. A competing risk model (CRM) aids in analyzing the extent of the working volume when returning to work. Both models use the same framework and are longitudinal. Concerning the working volume of mothers with pre-school children, two different kinds of models are used as mutual robustness checks since each model has its own shortcomings: propensity score matching (PSM) and difference-in-difference (DiD) estimation.

**5.4.1 Time and extent of return-to-job after giving birth**

*5.4.1.1 Event history analysis and competing risk models*

EHA focuses on changes which happen at a specific point in time to a sample of individuals (Allison, 1984). The following description of EHA is mainly derived from Teele (2016) and Cleves, Gould, and Marchenko (2016).

In EHA, observations can be either left- or right-censored. Right-censored cases do not experience the failure within the available observation window, while left-censored cases do not experience the starting point of the analysis within the dataset. The main concept of EHA is the hazard rate, which estimates the conditional probability that an event will occur, given that it has not already occurred. The hazard rate *h(t)* thus describes, period by period, the risk that a certain event will occur among those who have not yet experienced it. It has two main mathematical components: the survival function *S(t)* and the probability of failure *f(t)*. These terms are adopted from the natural sciences, where EHA was first applied to analyze the life cycle of living organisms. Let *T* be a random variable denoting the time of the event. The probability that an individual experiences the event at time $t_j$ is given by the failure function, where *j* indexes the time periods (for instance measured in months):

$f(t_j) = \Pr(T = t_j)$ (5.2)

The survival function as the second part of the hazard is defined as

$S(t_j) = \Pr(T \geq t_j) = \sum_{m \geq j} f(t_m)$ (5.3)

The hazard rate is the ratio of the probability of failure to the probability of survival. This can also be expressed in other terms, namely as the conditional probability of failure given that a failure has not already occurred. There are two ways to express this relationship:

$h(t_j) = \frac{f(t_j)}{S(t_j)}$

$h(t_j) = \Pr(T = t_j \mid T \geq t_j)$

The last equation reveals the rate at which an individual experiences the failure conditional on survival until *j*. The inclusion of time-varying explanatory variables $x_{i,j}$ leads to the following augmentation of the hazard:

$h(t_j \mid x_{i,j}) = \Pr(T_i = t_j \mid T_i \geq t_j, x_{i,j})$
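The link between failure function, survival function, and hazard can be made concrete with a small numerical sketch. The failure probabilities below are invented for illustration only; the remaining probability mass stands for spells that have not ended by the last observed month.

```python
import numpy as np

# Hypothetical failure probabilities f(t_j) for months j = 1..5; the remaining
# mass (0.20) belongs to spells that have not ended by month 5.
f = np.array([0.10, 0.20, 0.25, 0.15, 0.10])

# S(t_j) = Pr(T >= t_j): the share that has not failed before month j.
failed_before = np.concatenate(([0.0], np.cumsum(f)[:-1]))
S = 1.0 - failed_before

# Hazard as the ratio of the probability of failure to the probability of survival.
h = f / S

for j, (f_j, s_j, h_j) in enumerate(zip(f, S, h), start=1):
    print(f"month {j}: f = {f_j:.2f}, S = {s_j:.2f}, h = {h_j:.3f}")
```

In this example the failure probability falls after month 3 while the hazard stays high, because fewer individuals remain at risk in the later months.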

The survivor function can also be estimated for subgroups of the sample in order to compare how survival develops over time in connection with certain characteristics. The Kaplan-Meier estimator is a common tool for this purpose. This estimator is the “product of the percent of observations in the sample that survive each period” (Teele, 2016, p. 6). Mathematically, the Kaplan-Meier estimator can be expressed in the following way for each sub-group:

$\hat{S}(t) = \prod_{j:\, t_j \leq t} \frac{n_j - r_j}{n_j}$

where $n_j$ represents the number of observations at risk of failure at time $t_j$ and $r_j$ denotes the number of observations that experienced the failure within period $t_j$. In other words, the Kaplan-Meier estimator represents the cumulative survival probability at any given point in time.

With regard to the regression analysis with the inclusion of time-invariant covariates, the Cox proportional hazards model is generally chosen for the specification of the regression models. It leaves the baseline hazard unspecified, allowing the researcher to obtain estimates of the covariates for impact evaluation without having to make constraining assumptions about the distribution of event occurrence times. Therefore, the Cox proportional hazards model was chosen for two reasons: the hazard function does not have to be specified as it relates to time, and the values of explanatory variables can change over time. The Cox model “is called the proportional hazards model because for any two individuals at any point in time, the ratio of their hazards is a constant” (Allison, 1984, pp. 33–34).
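The Kaplan-Meier estimator described above can be sketched in a few lines. This is an illustrative implementation with made-up spell data, not the estimation routine used in the thesis; the function name and the example values are assumptions.

```python
import numpy as np

def kaplan_meier(durations, observed):
    """Return event times and the Kaplan-Meier survival estimate at each of them.

    durations: spell length per observation (e.g. months until return to work)
    observed:  1 if the failure was observed, 0 if the spell is right-censored
    """
    durations = np.asarray(durations)
    observed = np.asarray(observed, dtype=bool)
    event_times = np.unique(durations[observed])
    survival = []
    s = 1.0
    for t in event_times:
        n_j = np.sum(durations >= t)               # at risk of failure at t_j
        r_j = np.sum((durations == t) & observed)  # failures within period t_j
        s *= (n_j - r_j) / n_j                     # share surviving this period
        survival.append(s)
    return event_times, np.array(survival)

# Hypothetical spells in months; 0 marks right-censored observations.
months = [3, 5, 5, 8, 12, 12, 14, 20]
returned = [1, 1, 0, 1, 1, 1, 0, 0]
times, surv = kaplan_meier(months, returned)
for t, s in zip(times, surv):
    print(f"month {t}: estimated survival {s:.3f}")
```

Censored spells never count as failures, but they remain in the risk set until their censoring time, which is why they still affect the estimate.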

The EHA as presented here covers the case in which an individual faces only two possible states: the event occurs or it does not. However, it may also be relevant to analyze the choice between several competing risks. In a CRM, an individual can potentially fail from any of *K* event types, but only the time to failure for the earliest of these (or the last follow-up time if no failure has yet occurred) is observed. These events are mutually exclusive. Two quantities taken from the observations are relevant here: *T* denotes the failure time, and *δ* denotes the failure status indicator, which records either the type of failure that occurred or that no failure has occurred. Thereby, two different sets of information can be acquired although only one event time is recorded. For instance, if someone failed from *δ* = cause A at *T* = 2 months, it is simultaneously known that the person remained free of failure from cause B for 2 months.

The cause-specific hazard function $\lambda_k(t)$ is the principal identifiable quantity in competing risk observations. It represents the probability of failure due to cause *k* at a moment in time, given that no failure of any kind has occurred previously. The cause-specific hazard summed from the start of observation to time *t* is denoted $\Lambda_k(t)$, the cumulative cause-specific hazard. The cause-specific hazards for the *K* events add up to the hazard of failure from any of the events, leading to the following equation for the cumulative hazard function for failures from any cause:

$\Lambda(t) = \Lambda_1(t) + \Lambda_2(t) + \dots + \Lambda_K(t)$ (5.8)

The corresponding survival function *S(t)* is defined as the probability of remaining event-free past time *t*, that is, the probability that an individual experiences none of the events. It is denoted as $S(t) = \exp(-\Lambda(t))$ (Cleves et al., 2016).
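The additivity of the cause-specific cumulative hazards can be checked numerically. The sketch below uses a Nelson-Aalen style sum of failures over the number at risk, which is an assumption made for illustration; the text does not state which estimator is used. The data and the cause labels (part-time and full-time return) are hypothetical.

```python
import numpy as np

def cumulative_cause_specific_hazard(T, delta, cause, t):
    """Nelson-Aalen style estimate of the cumulative hazard for one cause up to time t.

    T:     observed failure or censoring time per individual
    delta: failure status (0 = no failure observed, otherwise the cause label)
    """
    T, delta = np.asarray(T), np.asarray(delta)
    total = 0.0
    for s in np.unique(T[(delta == cause) & (T <= t)]):
        at_risk = np.sum(T >= s)                       # still event-free just before s
        failures = np.sum((T == s) & (delta == cause))
        total += failures / at_risk
    return total

# Hypothetical data: 1 = part-time return, 2 = full-time return, 0 = censored.
T = [2, 4, 4, 6, 9, 9, 12, 15]
delta = [1, 2, 1, 0, 1, 2, 2, 0]

Lambda_1 = cumulative_cause_specific_hazard(T, delta, cause=1, t=12)
Lambda_2 = cumulative_cause_specific_hazard(T, delta, cause=2, t=12)
Lambda_all = Lambda_1 + Lambda_2   # additivity across causes, as in equation 5.8
S_12 = np.exp(-Lambda_all)         # probability of remaining event-free past month 12
print(Lambda_1, Lambda_2, Lambda_all, S_12)
```

Summing the two cause-specific estimates gives the same value as treating every failure as a single event type, which is the property that equation 5.8 expresses.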

In the context of maternal employment, EHA is often applied to analyze the time of a mother’s entry into paid employment after childbirth. For instance, Loft and Hogan (2014) analyze the effect of care availability on a mother’s entry into paid employment in the USA after childbirth. They show that use of non-parental childcare prior to employment is independently and positively associated with entry into maternal employment.

Here, the unit of analysis consists of all mothers who gave birth in the observed time period. A comparison is made between mothers who were working in a company providing ESCC before giving birth and mothers whose employer does not offer ESCC. The time spells are counted in months. The month of giving birth is regarded as the zeroth spell, and the spells are counted from there separately for each observation. The maximum number of spells is 48, which applies if a mother gave birth in the first month covered by the dataset. The analysis focuses on mothers who gave birth for the first time or whose newborn child only has siblings who are no longer of pre-school age. This decision ensures a consistent interpretation of the results. For instance, if a mother is already using a kindergarten for her three-year-old child, the provision of ESCC might not be as influential as for mothers who give birth for the first time. A similar approach is chosen by Fitzenberger, Steffes and Strittmatter (2015). It should be noted that mothers who receive the parents’ money are still allowed to work up to 30 hours per week (Wrohlich et al., 2012). Hence, mothers who receive the parents’ money are included in this analysis.

Since there is a wide range of possible independent variables, tests of equality across strata are used to decide which of them to include in the final model and thereby to develop the most powerful model. For the categorical variables, the log-rank test of equality across strata has been used. For the continuous variables, a univariate Cox proportional hazards regression has been used. A variable is included if the test yields a p-value of 0.25 or less. If a variable has a p-value greater than 0.25 in a univariate analysis, it is highly unlikely that it will contribute anything to a model which includes other variables (Cleves et al., 2016).
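The screening rule for categorical variables can be sketched as follows. This is a simplified two-group log-rank test written from the standard textbook formula, not the routine used in the thesis; the function names, the example spells, and the grouping variable are hypothetical.

```python
import math
import numpy as np

def log_rank_p_value(durations, observed, group):
    """Two-group log-rank test of equality of survivor functions; returns the p-value."""
    durations = np.asarray(durations)
    observed = np.asarray(observed, dtype=bool)
    group = np.asarray(group, dtype=bool)
    obs_minus_exp, variance = 0.0, 0.0
    for t in np.unique(durations[observed]):
        at_risk = durations >= t
        n = at_risk.sum()                              # at risk in both groups
        n1 = (at_risk & group).sum()                   # at risk in group 1
        d = ((durations == t) & observed).sum()        # failures in both groups
        d1 = ((durations == t) & observed & group).sum()
        obs_minus_exp += d1 - d * n1 / n               # observed minus expected failures
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    chi2 = obs_minus_exp ** 2 / variance
    return math.erfc(math.sqrt(chi2 / 2.0))            # chi-square tail probability, 1 df

def keep_variable(p_value, threshold=0.25):
    """Screening rule: keep a candidate variable if its test p-value is at most 0.25."""
    return p_value <= threshold

# Hypothetical spells in months, all observed, split by a binary candidate variable.
months = [2, 3, 4, 5, 6, 7, 10, 12, 14, 16, 18, 20]
observed = [1] * 12
candidate = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
p = log_rank_p_value(months, observed, candidate)
print(f"p = {p:.4f}, keep: {keep_variable(p)}")
```

The threshold of 0.25 is deliberately generous: the test is used to drop clearly irrelevant variables, not to establish significance.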

To test the hypotheses in the regression model, it is important to consider their effect in combination with the provision of ESCC. Therefore, interaction terms are used when an independent variable is assumed to have a different effect on the outcome depending on the values of another independent variable. For instance, the interaction “Availability ESCC * Usage of HR policies” is included in the regression. The resulting coefficient is the effect of the usage of HR policies on the outcome variable depending on the availability of ESCC. The theory has highlighted that HR policies might be influential as well. Hence, the usage of HR policies is also included on its own (Stock & Watson, 2007).

**5.4.2 Working volume with pre-school children**

*5.4.2.1 Propensity score matching*

PSM is a commonly used method to assess the effect of a policy on an outcome variable. In a nutshell, a treatment and a control group are matched according to similar observable characteristics based on a propensity score. The only difference between both groups is that the treatment group benefits from the introduction of a policy and the control group does not (Heckman, LaLonde, & Smith, 1999). The following description is derived from Khandker, Koolwal, and Samad (2010).

The construction of a statistical comparison group in PSM derives from a model of the probability of participating in the treatment *T* conditional on observed characteristics *X*, that is, the propensity score: $P(X) = \Pr(T = 1 \mid X)$. It is thus necessary to assume conditional independence and the presence of a common support.

The conditional independence assumption, also referred to as unconfoundedness, states that the potential outcomes *Y* are independent of the assignment to the treatment *T*, given selected observable variables *X* which are not affected by the treatment. With $Y_i^T$ denoting the outcome for the members of the treatment group and $Y_i^C$ denoting the outcome for the members of the control group, conditional independence implies

$(Y_i^T, Y_i^C) \perp T_i \mid X_i$ (5.9)

This assumption is not directly testable since it depends on the distinct features of the policy. If there are unobserved characteristics which influence the participation in the policy, conditional independence is violated. Having a dataset with many variables helps to support the conditional independence assumption, since one is able to control for several observable characteristics which might affect being in the treatment group, under the assumption that unobserved selection is limited.

The second assumption, called common support or overlap condition, ensures that the members of the treatment group have members in the control group nearby in the propensity score distribution

$0 < P(T_i = 1 \mid X_i) < 1$ (5.10)

A substantial region of common support requires that the number of members in the treatment and control groups is roughly equal. The more equal both groups are, the more effective PSM is. This assumption implies that some members of the control group might have to be dropped if they have no similar counterparts in the treatment group in terms of observed characteristics unaffected by the usage of the treatment policy. If a member of the treatment group needs to be dropped, a potential sampling bias must be considered.

Applying PSM requires the estimation of program participation, meaning that participation *T* is regressed on all observable variables *X* that are likely to influence participation. For this purpose, a logit model can be used in which membership in the treatment group (*T* = 1) versus the control group (*T* = 0) is the dependent variable. Afterwards, the predicted values of *T* from the participation equation are taken. These predicted values can be interpreted as the estimated probability of participation in the treatment group, also known as the propensity score. Every individual in either the treatment or the control group acquires an estimated propensity score denoted as $\hat{P}(X \mid T = 1) = \hat{P}(X)$. Furthermore, one needs to define the region of common support and conduct balancing tests. The region of common support refers to the overlap of the distributions of the propensity score for the treatment and control group members. This implies that some observations might need to be dropped if they have extraordinary values in one of the observable covariates. The balancing tests check whether $\hat{P}(X \mid T = 1) = \hat{P}(X \mid T = 0)$. This implies that the average propensity score and the mean of *X* are similar within each quantile of the propensity score distribution. It might be that a matched pair consisting of observations in the treatment and control group has a similar propensity score, but misspecification exists. Hence, it is checked whether their scores are based on similar observed *X*.

It is also necessary to match participants to nonparticipants. The matching process is conducted with two different approaches for robustness purposes. Firstly, radius matching ensures that each individual of the treatment group is matched with control units that have a propensity score in a “predefined neighborhood of the propensity score of the treated unit” (Becker & Ichino, 2002, p. 361). The size of the radius can influence the likelihood of including or excluding certain control units. For instance, control units with an extreme propensity score might be excluded. Kernel matching is taken as a second method. Here, all units of the treatment group are matched with a “weighted average of all controls with weights that are inversely proportional to the distance between the propensity scores of treated and controls” (Becker & Ichino, 2002, p. 362). Finally, the average treatment impact can be assessed by taking the mean difference in outcomes over the common support, weighting the comparison units by the propensity score distribution of the participants.
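The steps described above (logit participation model, propensity score, radius and kernel matching) can be sketched with simulated data. Everything in this example is an assumption made for illustration: the data-generating process, the true effect of 4 hours, the radius and bandwidth of 0.05, and the Gaussian kernel. It is not the specification used in the thesis.

```python
import numpy as np

def fit_logit(X, T, iterations=25):
    """Estimate Pr(T = 1 | X) with a logit model fitted by Newton-Raphson."""
    Z = np.column_stack([np.ones(len(T)), X])
    b = np.zeros(Z.shape[1])
    for _ in range(iterations):
        p = 1.0 / (1.0 + np.exp(-Z @ b))
        gradient = Z.T @ (T - p)
        hessian = (Z * (p * (1.0 - p))[:, None]).T @ Z
        b += np.linalg.solve(hessian, gradient)
    return 1.0 / (1.0 + np.exp(-Z @ b))    # estimated propensity scores

def att_radius(y, T, score, radius=0.05):
    """Radius matching: compare each treated unit with the controls inside the radius."""
    controls = T == 0
    effects = []
    for i in np.flatnonzero(T == 1):
        near = controls & (np.abs(score - score[i]) <= radius)
        if near.any():                      # treated units without a match are dropped
            effects.append(y[i] - y[near].mean())
    return np.mean(effects)

def att_kernel(y, T, score, bandwidth=0.05):
    """Kernel matching: weight all controls by a Gaussian kernel in score distance."""
    yc, sc = y[T == 0], score[T == 0]
    effects = []
    for i in np.flatnonzero(T == 1):
        w = np.exp(-0.5 * ((sc - score[i]) / bandwidth) ** 2)
        effects.append(y[i] - np.sum(w * yc) / np.sum(w))
    return np.mean(effects)

rng = np.random.default_rng(1)
n = 4000
X = rng.normal(size=(n, 2))                 # hypothetical observed covariates
p_true = 1.0 / (1.0 + np.exp(-(0.8 * X[:, 0] - 0.5 * X[:, 1])))
T = (rng.random(n) < p_true).astype(int)    # selection on observables only
# Hypothetical outcome (weekly hours) with an assumed true treatment effect of 4.
y = 20.0 + 3.0 * X[:, 0] + 2.0 * X[:, 1] + 4.0 * T + rng.normal(size=n)

score = fit_logit(X, T)
naive = y[T == 1].mean() - y[T == 0].mean()
print(f"naive: {naive:.2f}, radius: {att_radius(y, T, score):.2f}, "
      f"kernel: {att_kernel(y, T, score):.2f}")
```

Because selection here depends only on observed covariates, both matching estimates should land near the assumed effect, while the naive comparison of means does not. With selection on unobservables, matching would not remove the bias.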

*5.4.2.2 Difference-in-Difference estimator*

The difference-in-difference (DiD) estimator allows for unobserved heterogeneity, which could otherwise cause selection bias. The DiD estimator “compares treatment and comparison groups in terms of outcome changes over time, relative to the outcomes observed for a pre-intervention baseline” (Khandker et al., 2010, p. 72). It estimates the average impact of a policy as follows:

$DiD = E(Y_1^T - Y_0^T \mid T_1 = 1) - E(Y_1^C - Y_0^C \mid T_1 = 0)$ (5.10)

Here, *t* = 0 is defined as the time period before the intervention and *t* = 1 as the time period after the intervention. $Y_1^T$ and $Y_1^C$ define the outcomes for the members of the treatment group and the control group, respectively. The comparison of treatment and control group members before and after the introduction of a policy is the essential component of the DiD. Hence, baseline data from before the intervention allow an estimate of the impact under the assumption that unobserved heterogeneity is time invariant and uncorrelated with the treatment over time. Under this assumption, the change in outcomes for the control group of individuals who do not benefit from the treatment policy (that is, $E(Y_1^C - Y_0^C \mid T_1 = 0)$) can be used as an appropriate counterfactual for the treatment group (that is, $E(Y_1^T - Y_0^T \mid T_1 = 1)$). The following figure shows a schematic illustration of the calculation of a DiD outcome.

Figure 5-2: Schematic illustration of DiD analysis

Source: Own design on the basis of Khandker et al. (2010)

Stated in descriptive terms, the DiD impact is obtained by first measuring the average change in the outcome separately for the treatment and the control group members and then calculating the difference between these two average changes. In mathematical terms, it would be stated as $DiD = (Y_4 - Y_0) - (Y_3 - Y_1)$.

The lowermost line depicts the true counterfactual outcome, which is never actually observed. The figure shows that the DiD approach assumes that there are unobserved characteristics which create a gap between the measured control outcomes and the true counterfactual. However, this gap is assumed to be time invariant, as demonstrated by the two trends in the figure. This assumption can be stated in the following terms: $(Y_3 - Y_2) = (Y_1 - Y_0)$. Using this equality in the preceding DiD equation, the derived outcome is $DiD = (Y_4 - Y_2)$. Applying the DiD estimator in a regression framework leads to the following equation:

$Y_{it} = \alpha + \beta T_{i1} t + \rho T_{i1} + \gamma t + \varepsilon_{it}$ (5.11)

The average DiD estimate of the program is included in the coefficient *β* in the interaction of the post-program treatment variable *T _{i1}* and time (

*t*= 1…T). Hence, ← 127 | 128 →

*β*equals DiD from the equation 5.1. The constant is denoted as

*α*and the error term is denoted as

*ε*. In addition to the interaction term, the variables

*T*and

_{i1}*t*are included distinctly to acquire the independent results for the effect of the time between the pre-treatment and post-treatment period as well as the effect of being in the treatment group compared to the control group. Combining both approaches to introduce the intuition behind equation 5.2., the following two equations in expectation form arise:

*E(Y_{1}^{T} − Y_{0}^{T} | T_{1} = 1) = (α + DiD + ρ + γ) − (α + ρ)* (5.12.a)

*E(Y_{1}^{C} − Y_{0}^{C} | T_{1} = 0) = (α + γ) − α* (5.12.b)

In accordance with equation 5.10, the DiD estimator is obtained by subtracting 5.12.b from 5.12.a. DiD can only be interpreted as an unbiased estimator if the selection bias is additive and time invariant. Going further, estimating a policy impact through a simple pre- versus post-treatment comparison of treatment group members would yield DiD + *γ* and thus a bias of *γ*. Including a control group ensures that such other influences over time are not attributed to the treatment. Another approach is to compare the post-treatment outcomes of treatment and control group members. Here, the result would be DiD + *ρ*, leading to a bias of *ρ*, because systematic differences between the groups could not be separated from the treatment. To interpret the DiD estimate correctly, the following two assumptions must hold.

(1) The specification of the equation for the DiD estimator must be correct.

(2) The error term is uncorrelated with the other variables in the equation, that is:

a. *Cov(ε_{it}, T_{i1}) = 0*

b. *Cov(ε_{it}, t) = 0*

c. *Cov(ε_{it}, T_{i1}t) = 0*

The last assumption, also known as the parallel-trend assumption, is essential here. It states that unobserved characteristics influencing program participation do not change over time with the treatment status. This description is based on Khandker et al. (2010), but the approach is also described and used in various other studies on impact evaluation (Card & Krueger, 1994; Hirano, Imbens, & Ridder, 2003; Ravallion, Galasso, Lazo, & Philipp, 2005).
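The regression logic of equation 5.11 can be illustrated with a small simulation. The following sketch is only an illustration under stated assumptions: the parameter values, the sample size, and the helper functions `solve`, `ols`, and `cell_mean` are hypothetical and do not stem from the thesis or the FiD data. It generates a two-group, two-period data set, estimates the coefficients by ordinary least squares, and compares *β* with the two naive comparisons discussed above.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(X, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

random.seed(0)
ALPHA, DID, RHO, GAMMA = 20.0, 3.0, 2.0, 1.5  # assumed "true" parameters

X, y = [], []
for treated in (0, 1):
    for period in (0, 1):
        for _ in range(500):
            # Columns follow equation 5.11: constant, T*t, T, t.
            X.append([1.0, treated * period, treated, period])
            y.append(ALPHA + DID * treated * period + RHO * treated
                     + GAMMA * period + random.gauss(0.0, 1.0))

alpha_hat, beta_hat, rho_hat, gamma_hat = ols(X, y)

def cell_mean(treated, period):
    """Mean outcome of one group in one period."""
    vals = [yi for r, yi in zip(X, y) if r[2] == treated and r[3] == period]
    return sum(vals) / len(vals)

# Naive pre/post comparison on the treated: equals beta_hat + gamma_hat.
pre_post = cell_mean(1, 1) - cell_mean(1, 0)
# Naive post-treatment comparison across groups: equals beta_hat + rho_hat.
post_only = cell_mean(1, 1) - cell_mean(0, 1)

print(f"beta (DiD): {beta_hat:.2f}")
print(f"pre/post on treated: {pre_post:.2f}, post-only comparison: {post_only:.2f}")
```

Because the model is saturated in the two dummies and their interaction, the estimated *β* coincides with the difference of the four cell means, and the two naive comparisons deviate from it by exactly the estimated *γ* and *ρ*, respectively.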

PSM will be applied in the following way: mothers who work in a company with ESCC are matched to mothers who work in a company without ESCC. Then, differences in working behavior can be assessed. This difference can be interpreted as the impact of ESCC on working behavior and is called the average treatment effect.
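The matching step can be sketched as follows. This is a minimal illustration, not the procedure used in the thesis: it assumes that propensity scores have already been estimated (for instance with a logit model), uses one-to-one nearest-neighbour matching with replacement, and works on invented values. The function `nearest_control` is a hypothetical helper.

```python
# Each record: (propensity score, weekly working hours); hypothetical values.
treated = [(0.30, 28.0), (0.45, 30.0), (0.60, 25.0)]   # employer offers ESCC
controls = [(0.28, 24.0), (0.47, 27.0), (0.55, 24.0), (0.10, 20.0)]  # no ESCC

def nearest_control(score, pool):
    """Return the control record whose propensity score is closest to `score`."""
    return min(pool, key=lambda rec: abs(rec[0] - score))

# One-to-one nearest-neighbour matching with replacement.
pairs = [(t, nearest_control(t[0], controls)) for t in treated]

# Average outcome difference across the matched pairs.
effect = sum(t[1] - c[1] for t, c in pairs) / len(pairs)

for t, c in pairs:
    print(f"treated score {t[0]:.2f} matched to control score {c[0]:.2f}")
print(f"Average matched difference in weekly hours: {effect:.2f}")
```

Since every treated mother receives a match while unmatched controls are ignored, the resulting average strictly describes the effect for the treated group.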

During the matching process, it is important to check whether the common support assumption is fulfilled. For this assumption to hold, the propensity scores of treated and untreated units need to overlap sufficiently so that adequate matches can be found (Khandker et al., 2010). The following figure shows the propensity scores of the treatment and control groups when ESCC is available for mothers.

Figure 5-3: Propensity scores for both groups when ESCC is available for mothers

Source: FiDv4.0, own calculations and design

Since the propensity score distributions of both groups are clearly left biased and thus overlap, the common support assumption may be regarded as fulfilled. As already mentioned, PSM will also be applied in combination with DiD. Both methods have limitations: DiD assumes that unobserved heterogeneity is time invariant, and PSM assumes that selection depends on observable characteristics only. Therefore, PSM is used not only on its own but also in combination with DiD (Khandker et al., 2010). Using the DiD estimator also allows the inclusion of independent variables in the analysis, which is not suitable in PSM calculations.
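One simple way to inspect common support numerically is the minimum-maximum comparison, which keeps only units whose propensity score lies in the range covered by both groups. The sketch below is an assumption for illustration: the thesis assesses common support graphically, and the scores shown here are invented.

```python
# Hypothetical propensity scores for both groups.
treated_scores = [0.12, 0.18, 0.25, 0.33, 0.41, 0.58]
control_scores = [0.05, 0.09, 0.14, 0.21, 0.30, 0.44]

# Region of common support: the score range covered by both groups.
lower = max(min(treated_scores), min(control_scores))
upper = min(max(treated_scores), max(control_scores))

def on_support(score):
    """True if the score lies inside the common support region."""
    return lower <= score <= upper

kept_treated = [s for s in treated_scores if on_support(s)]
kept_controls = [s for s in control_scores if on_support(s)]
dropped = (len(treated_scores) - len(kept_treated)
           + len(control_scores) - len(kept_controls))

print(f"Common support: [{lower}, {upper}], units dropped: {dropped}")
```

If many units fall outside this region, matches become scarce and the estimated effect only describes the overlapping part of the sample.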

For DiD estimations, it is important to distinguish clearly between a pre-treatment and a post-treatment period. DiD estimators are often used if there is a naturally defined pre- and post-treatment time, such as the introduction of a policy. Since the dataset does not allow identifying when ESCC was introduced in a mother's company, a different approach is taken: the dataset has information on job changes, meaning that each respondent reports whether he or she has changed employers. Hence, it can be identified whether a mother gained access to ESCC because the new employer offers it while the former employer did not. The identification of a pre- and post-treatment period through a job change has frequently been used (Lauber & Storck, 2016). This approach implies that three different time periods can be observed: from the first year to the second year, from the second year to the third year, and from the third year to the fourth year. For every time period, it is checked whether or not the person has changed jobs. To reduce the likelihood of attributing effects to ESCC that actually depend on other factors (for instance a higher salary), several control variables are included in the regression model.
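The identification of pre- and post-treatment periods through job changes can be sketched as follows. The record layout, the field order, and the function `transitions` are hypothetical and do not reflect actual FiD variable names. In addition, the control group is assumed here to consist of year-to-year transitions without ESCC in either year.

```python
# Hypothetical person-year records:
# (person_id, survey_year, changed_employer, escc_available)
panel = [
    (1, 1, False, False), (1, 2, True, True),
    (1, 3, False, True), (1, 4, False, True),
    (2, 1, False, False), (2, 2, False, False),
    (2, 3, True, False), (2, 4, False, False),
]

def transitions(records):
    """Yield (person_id, pre_year, post_year, group) for consecutive years."""
    by_person = {}
    for pid, year, changed, escc in sorted(records):
        by_person.setdefault(pid, []).append((year, changed, escc))
    for pid, rows in by_person.items():
        for (y0, _, escc0), (y1, changed, escc1) in zip(rows, rows[1:]):
            if y1 != y0 + 1:
                continue  # skip gaps in the panel
            if changed and not escc0 and escc1:
                group = "treatment"  # new employer offers ESCC, former did not
            elif not escc0 and not escc1:
                group = "control"    # assumed: no ESCC in either year
            else:
                group = "other"      # not used in this comparison
            yield pid, y0, y1, group

result = list(transitions(panel))
for pid, y0, y1, group in result:
    print(f"person {pid}: year {y0} -> {y1}: {group}")
```

With four survey years, each person contributes up to three transitions, which corresponds to the three time periods named above.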

Khandker et al. (2010) propose the use of fixed effects (FE) alongside OLS regressions to control for unobservable factors in panel data analysis. A short description of the methodological approach of FE is included in the Appendix. The dependent variable is working hours. The different ways of operationalization are provided in the following section.
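As a rough orientation on how FE removes time-invariant unobservables, the following sketch applies the within transformation to a tiny invented panel with a single regressor. It is not the estimation routine of the thesis; the data and the variable names are assumptions for illustration.

```python
from statistics import mean

# Hypothetical panel: person -> list of (escc_used, weekly_hours) per year.
panel = {
    "a": [(0, 20.0), (1, 24.0), (1, 25.0)],
    "b": [(0, 30.0), (0, 31.0), (1, 34.0)],
}

x_dm, y_dm = [], []
for rows in panel.values():
    x_bar = mean(x for x, _ in rows)
    y_bar = mean(y for _, y in rows)
    # Within transformation: subtracting each person's own mean removes any
    # time-invariant person-specific effect.
    x_dm += [x - x_bar for x, _ in rows]
    y_dm += [y - y_bar for _, y in rows]

# OLS slope on the demeaned data (no intercept is needed after demeaning).
beta_fe = sum(x * y for x, y in zip(x_dm, y_dm)) / sum(x * x for x in x_dm)

print(f"Within (FE) estimate: {beta_fe:.2f}")
```

Person "b" works about ten hours more than person "a" in every year, yet this level difference does not enter the estimate because only deviations from each person's own mean are used.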

Several authors (for instance Butts, Casper, and Yang (2013) and Grover and Crooker (1995)) showed that the positive effect of providing family-friendly HRM can be of the same size regardless of whether the employee actually used it or was merely aware of the possibility of using it. It seems possible that part of the effect is driven by the offer itself, because a continuing offer preserves the possibility of using it in the future. The following estimates thus consider both the availability and the usage of ESCC.

Using the DiD estimator implies that an appropriate control group must be found. The dataset allows several approaches, which are displayed in the following figure.

Figure 5-4: Overview of sensitivity tests on the direct impact of ESCC

Source: Own design based on Khandker et al. (2012)

The figure reveals that the first and second specification of the regression model differ with regard to the treatment group but share the same control group. The first treatment group is marked by the availability of ESCC in the post-treatment period, while the second treatment group actually uses ESCC in the same period. Building on the first specification, the first sensitivity check uses a different control group, for which ESCC is available in both the pre- and post-treatment periods. Hence, this sensitivity test addresses whether the mere introduction of ESCC makes a difference. The second sensitivity test focuses on the usage of ESCC: the members of the treatment group do not experience ESCC in the pre-treatment period but use it in the post-treatment period, while the control group members use ESCC in both periods. In comparison to the first sensitivity check, it examines whether usage and availability lead to the same results. In the third sensitivity test, the treatment group changes from the availability of ESCC to the use of ESCC, while the control group experiences availability in both periods. The fourth specification is likewise characterized by the change from availability to usage of ESCC for the treatment group, but here the control group uses ESCC in both periods. These different specifications serve as robustness checks of the main specification.
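The group definitions described above can be summarized in a small lookup structure. The sketch below is an illustration under assumptions: the state labels, the dictionary `SPECIFICATIONS`, and the function `assign_group` are hypothetical. Moreover, the text does not spell out the pre-treatment state of the main treatment groups or the composition of the main control group, so "no ESCC" is assumed for both.

```python
# ESCC state in a period: "none", "available" (offered, not used), or "used".
# Each entry: ((pre, post) of treatment group, (pre, post) of control group).
SPECIFICATIONS = {
    # Main specifications; the ("none", "none") control group is an assumption.
    "main 1 (availability)": (("none", "available"), ("none", "none")),
    "main 2 (usage)": (("none", "used"), ("none", "none")),
    # Sensitivity tests as described in the text.
    "sensitivity 1": (("none", "available"), ("available", "available")),
    "sensitivity 2": (("none", "used"), ("used", "used")),
    "sensitivity 3": (("available", "used"), ("available", "available")),
    "sensitivity 4": (("available", "used"), ("used", "used")),
}

def assign_group(spec, pre, post):
    """Return 'treatment', 'control', or None for one pre/post state pair."""
    treatment, control = SPECIFICATIONS[spec]
    if (pre, post) == treatment:
        return "treatment"
    if (pre, post) == control:
        return "control"
    return None  # the observation is not part of this comparison

print(assign_group("sensitivity 3", "available", "used"))
print(assign_group("sensitivity 4", "used", "used"))
```

Writing the specifications side by side makes it visible that sensitivity tests 1 and 2 vary the control group of the main specifications, whereas tests 3 and 4 vary the pre-treatment state of the treatment group.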

**5.5 Operationalization of variables**

This section describes the operationalization of the variables. Firstly, the operationalization of the dependent variables will be pursued, followed by the focus variable, ESCC. Finally, several independent variables, differentiated between hypothesis-related variables and further control variables, will be examined.

Concerning the dependent variable, working behavior is analyzed for two periods in the life cycle of a mother, which are handled separately in this analysis. The theoretical framework has highlighted that employers want to increase the working effort of employees through efficiency wages. The working effort of mothers can be operationalized in different ways. Firstly, the decision to return to work after childbirth is investigated. This part covers both when and to what extent (working volume) a woman returns to work after childbirth. FiD allows observing the working behavior of mothers on a monthly basis, and the exact date of birth of each child is given. With regard to the working volume on a monthly basis, the data set allows differentiating between full-time, part-time, and marginal employment. Within the observed time span, the maximum salary for marginal employment is 400 euros per month. A specific distinction between full-time and part-time employment based on working hours is not included in the documentation of the data set (Schröder et al., 2013).^{15} Secondly, the working volume of mothers with young children is analyzed. Here, the FiD offers two different measurements. One variable contains the agreed working hours per week. The second variable measures the average number of hours devoted to work per day. Next to the agreed working time, this measurement includes the commute, preparation for work, and overtime. Individuals who work part-time (for instance two out of five days) are asked for their average time per day. The analysis was carried out for both variables.

With regard to the operationalization of ESCC, the description of various kinds of ESCC (see section 2.3.3 “The state of ESCC”), the literature review (see section 3.3.1 “ESCC”), and the theoretical framework (see section 4.2 “ESCC conceptualized as an efficiency wage”) revealed that the distinct kinds of ESCC will be aggregated in the context of the prevailing analysis. In particular, the theoretical framework revealed that the difference between monetary and non-monetary efficiency wages is relevant. Here, rather different kinds of non-monetary efficiency wages are analyzed. More specifically, the FiD allows differentiating between:

- On-site childcare
- Rented childcare spots in local childcare facilities
- Financial support (for instance in form of vouchers).

With regard to the hypothesis-related variables and further controls, the following table contains the specific information on the operationalization.

Table 5-1: Operationalization of variables

The operationalization follows the previous literature, as can be seen from the table. Human capital is operationalized as the formal qualification of the respondent. A distinction is made between no or low education (an inadequate qualification or general elementary school), vocational qualification, and tertiary education. The latter is the reference category since all variables are recorded as dummy variables. Concerning the variables on firm-specific human capital, previous literature has revealed that an indicator is advisable in the context of this analysis. The indicator comprises the following variables: tenure with the employer in years, the degree of autonomy in daily work, and whether the individual holds a high position. A higher job position tends to be occupied by qualified professional or managerial staff, a high-level/executive civil servant, or a foreman.

Equally weighted and additive indicators showed robust results over different specifications. The final models include an additive indicator since it reveals greater power. The usage of family-friendly human resource policies is recoded as a continuous variable indicating how many policies the individual used. The list of policies includes institutions to keep contact during maternity leave, consultancy in general matters regarding childcare, and the set-up of a women’s officer. It is important to highlight that a policy is only counted when a mother actually uses it and not when it is merely available. Pre-tests have revealed that the same variable based on availability never yielded significant results, while the usage-based variable does.

Concerning the economic sectors, the list of sectors according to NACE88 has been aggregated into five categories: public sector, manufacturing sector, knowledge-intensive sector, service sector, and education and health sector. To be able to interpret them in a meaningful way, they have been recoded as dummies with education and health as the reference category. The decision to aggregate the sectors into these five is derived from the theoretical framework.

The variables perceived organizational support, perception of stress at work, the importance of gaining a reputation, and satisfaction with salary are subjective assessments ranked by the respondent. They are scaled from zero to ten, with ten as the highest value. Perceived organizational support focuses here on the direct support by the supervisor. The satisfaction with childcare reflects the respondents' general attitude towards their childcare situation. While the dataset includes more distinct variables on this topic (for instance satisfaction with quality in terms of the child-teacher ratio), the number of missing values did not permit including them in the final model.

Both variables flexible working arrangement and irregular working time have been recoded as dummies. Flexible working arrangements are marked by having either a flexible working place or flexible working time. Irregular working time refers to either weekend work or shift work.

Concerning the control variables, age is included as a log. Marital status is aggregated into married and not married, where the latter may include widowed, single, or divorced individuals. The variables on income are included as logs, whereby pretests revealed whether the household or the partner’s income should be included next to the personal income. The variables on living in East or West Germany and on a migration background may reflect cultural attitudes. Concerning variables on further children, the age of the youngest child is included as a dummy for having a child aged three years or younger. The number of children is included as a continuous variable.
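The recoding rules described in this section could be sketched as follows. The input field names, the category labels, and the function `recode` are hypothetical and chosen for illustration only; they are not the variable names of the FiD dataset, and only a subset of the variables is shown.

```python
import math

def recode(resp):
    """Recode one hypothetical respondent record into model variables."""
    out = {}
    # Education dummies; tertiary education is the omitted reference category.
    out["edu_low"] = int(resp["education"] == "low")
    out["edu_vocational"] = int(resp["education"] == "vocational")
    # Sector dummies; education and health is the reference category.
    for sector in ("public", "manufacturing", "knowledge", "service"):
        out[f"sector_{sector}"] = int(resp["sector"] == sector)
    # Number of family-friendly policies actually used (not merely available).
    out["policies_used"] = sum(resp["policies_used"].values())
    # Dummies for flexible arrangements and irregular working time.
    out["flexible"] = int(resp["flex_place"] or resp["flex_time"])
    out["irregular"] = int(resp["weekend_work"] or resp["shift_work"])
    # Age and income enter as logs.
    out["log_age"] = math.log(resp["age"])
    out["log_income"] = math.log(resp["income"])
    # Marital status and youngest child aged three or younger as dummies.
    out["married"] = int(resp["marital_status"] == "married")
    out["young_child"] = int(resp["youngest_child_age"] <= 3)
    return out

example = {
    "education": "vocational", "sector": "service",
    "policies_used": {"contact_during_leave": True,
                      "childcare_consultancy": False,
                      "womens_officer": True},
    "flex_place": False, "flex_time": True,
    "weekend_work": False, "shift_work": False,
    "age": 34, "income": 2100.0,
    "marital_status": "married", "youngest_child_age": 2,
}
row = recode(example)
print(row)
```

Omitting one category per dummy set (tertiary education, education and health sector) keeps the remaining coefficients interpretable relative to that reference group.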

14 The parents’ money plus (ElterngeldPlus) was not yet in place during the data collection process.

15 In an email conversation, the contact person of the DIW for the FiD agreed that the demarcation of working hours on a monthly basis is not possible.