Reviewer #1 (Remarks to the Author):

In this manuscript, the authors report the coupling of a few-photon light field to a single atom in free space, making use of a "4pi" lens configuration, as known from high-resolution imaging. The achieved extinction of probe photon transmission due to scattering from the single atom reaches 37%, a value comparable to state-of-the-art waveguide systems. The presented results are an interesting step forward in the domain of free-space QED. Both the experiment and the analysis and comparison to theory are nicely presented and interesting. I only have a few comments, which are listed below. I think this work deserves publication in Nat. Comm. after the authors have considered these comments.

* I think the very limited length of the manuscript stems from previous restrictions. As I understand it, in NatComm these restrictions are much less severe. I would highly appreciate some more detail at various places in the manuscript. Most important to me are the presentations of eq. 1 and 2, which now are simply given together with references. A short overview of the derivation and/or a somewhat more detailed discussion would be very useful to a reader not extremely familiar with the topic.
* I find the repeatedly used phrase "weak probe beam" somewhat unclear. What exactly is meant by weak here? That the effective coupling strength (per time) is small compared to the excited state decay rate \Gamma? This more quantitative definition should be particularly important for the validity of eq. (2). In case of the coupling strength exceeding the decay rate, the single TLS should undergo Rabi oscillations, as recently observed for a Rydberg superatom in https://arxiv.org/abs/1705.04128. The theory presented there gives the full expression for g2(t1,t2) for any coupling strength. Eq. (2) should be valid in the "weak coupling regime".
* The agreement between theory and experiment in Fig. 3 is not too convincing (it is called "qualitative" in the text). The authors seem to understand their system very well, as shown, e.g., by the nice analysis of the g2 data. Why does this (more basic) measurement of the transmission not agree so well? A bit of discussion would be good here.
* I do not fully understand the discussion of the "single-photon converter" at the end of the manuscript, reachable with a stronger coupling. Do the authors mean that for stronger coupling a single photon is reflected while all others are transmitted? Such a "single-photon subtractor" has recently been demonstrated with a single atom coupled to a nanofiber (Nature Photonics 10, 19-22), although the physics of this system is slightly different as it makes use of 3 atomic levels and the polarization dependence of the light in the fiber. A somewhat more detailed discussion of how this scheme would work for the single TLS would be nice.
* A question in connection to these nanofiber systems: with the tight focussing of the light, to what extent is the field still transverse? Or does spatially varying polarization already play a role here, as in the (sub-wavelength) fiber systems? Will that be more relevant for the future higher-NA system mentioned in the outlook?

Reviewer #2 (Remarks to the Author):

In their manuscript “Nonlinear photon-atom coupling with 4Pi microscopy” the authors implement a technique known from high-resolution imaging to increase the atom-light coupling. The principle of this technique is to illuminate the atom from opposite directions such that the driving field forms a standing wave. The atom-photon coupling is then strongly increased if the atom is held in an antinode of the driving field. The authors observe an extinction of the driving field by about 36 %, a value that has not been reached before.
Furthermore, they demonstrate in a photon-correlation measurement that the nonlinear interaction between atom and photon leads to modified photon statistics in the transmitted field. The experimental claims are convincing, the manuscript is well written, and the representation of the experimental results is clear. The results are important for other groups and will certainly motivate them to increase their emitter-photon coupling by using this technique. I recommend publication in Nature Communications. However, there are a few (minor) points the authors should address before publication:

- For easier comparison of Fig. 1 b/c with d/e, it would be nice if the authors gave the NA of their lenses in the figure caption. It is only given in the Supplementary Information.
- If space constraints allow it, it would be helpful for the general reader to explain the terms of Eq. 1 intuitively.
- There must be a mistake in the optimal power splitting; it must read P_{2,in}= P_{1,in}\Lambda_2/\Lambda_1.
- Can the authors comment on how the total coupling \Lambda_{total} depends on the chosen post-selection? Could they increase the measured value by an even stricter post-selection, or is the residual difference due to atomic motion in the 1D lattice? Could the value reach the theoretical limit if a 3D lattice were used?
- It does not become clear to me why larger values of \Lambda will lead to photon bunching in the transmitted field. Could the authors explain this a bit more?
- Is a value of \Lambda=0.25 within experimental reach? What NA lenses would be necessary?
- The authors motivate their study with the prospect of deterministic all-optical quantum logic. Here, I see a weak point in the 4Pi technique: one has to use beam splitters to separate the input mode from the output mode. Therefore, there is always a trade-off between the fraction of input light that is sent towards the atom and the fraction of light that can be detected.
In an all-optical quantum processor, it might be important to lose neither input light nor output light. Have the authors thought about this problem? Do they have ideas on how to deal with it? It would be great if they could include a short discussion on this in the outlook.
- Methods: The post-selection process is not clear from this paragraph alone; I could only understand it after reading the SI. It is not mentioned here that the transmission in the second interval must be below a certain value, which signals good photon-atom coupling, in order to take data in the first interval into account.
- SI: on page 3, instead of referring to Fig. 3, the authors should refer to Fig. 4. This mistake has been made twice.
- SI: normalization of g^{(2)}(\tau). It seems to me that the normalization of each interval makes the data quite noisy, since there are only about 150 counts per interval with an error of +/-12 due to counting statistics. I expect that the fluctuations due to movement of the optical lattice are slow, so maybe it would make more sense to normalize about 20 intervals together in order to reduce noise? Have the authors considered this?

Reviewer #3 (Remarks to the Author):

The manuscript "Nonlinear photon-atom coupling with 4pi microscopy" reports on experiments where light is transmitted through/by a single atom held at the tight focus of the laser beam. The main findings are that double-sided illumination decreases the transmission significantly, and that the transmitted light shows sub-Poissonian photon statistics. I consider these important experimental demonstrations in the field, and well supported by the data presented, indicating that the conclusions are valid. The findings are clearly beyond statistical fluctuations and further supported by comparison to simple theoretical predictions.
In the case of the sub-Poissonian photon statistics, the "unprocessed data" is also shown in the supplementary information, which removes any doubt that conclusions are drawn based on artifacts from the data processing. The topic is of high interest to the scientific community and will definitely be of interest outside its specific field. The manuscript is well written in clear English. I enjoyed reading it. I am therefore happy to recommend publication in Nature Communications. I have a few minor comments:

1. Where the light-atom coupling efficiency \Lambda is introduced, I did not understand its definition without looking in Ref 15. In particular, it was unclear what the maximal possible amplitude refers to. As \Lambda plays a central role in the paper, it would be good to clarify this.
2. As far as I can see, it is only said in the supplementary material that, for one-sided illumination, the post-selection procedure does not change the observed transmission. This is a crucial point for interpretation of the main data, so I suggest that this is stated in the main text or methods.
3. The caption of Fig. 2 is a bit misleading. I presume that red and blue in b refer to the two detectors rather than the illumination paths, as is indicated.