Instructions for LHCb Run Chiefs Instructions for LHCb Run - TopicsExpress

Mohamed Nawas

Instructions for LHCb Run Chiefs

- Useful Links
- Coffee Machine and Refurbishment
- Daily Morning Routine Checks (Pre-Run Meeting)
- Weekly Agenda of meetings affecting Operation
- Run Meeting
- Run News and Run Minutes
- Internal Plan of the Day
- Run Coordination Dashboard
- Run Summary
- Trigger Configuration
- Current Problems and Remedies
- Data Quality
- Beam Dump
- Luminosity Leveling
- LHCb Dipole Polarity
- Access Request Handling
- Handling safety incidents
- Report at the LHCb Tuesday meeting

Useful Links

- Daily Operation Report: an assembly of information from logbooks, ProblemDB, alarms etc., per subsystem and per 24h, compiled automatically at 7.00 AM daily
- LHC activities of the week
- LHC Programme Coordinator, with many useful links to other meetings and LHC information
- LHC Logbook
- LHCb Logbook
- LHCb Operation plots and tables of data taking performance and statistics
- LHCb ProblemDB

The training (refresher) slides and the shifter instructions are very helpful for Run Chiefs. Please go through them. You are also always invited to the Shift Leader trainings and refreshers.

- Training slides
- Instructions and Quick Helps for shifters

Coffee Machine and Refurbishment

The control room is equipped with a nice Espresso coffee machine run on an honesty system: every consumer leaves 0.50 CHF per coffee in the mug next to the machine. Everybody will appreciate it enormously if the Run Chief occasionally takes care of restocking the coffee capsules. They may be purchased from the LHCb secretariat, which only accepts Swiss Francs. This regularly becomes a bit of a problem, since a lot of people leave Euros in the mug. Occasionally the collection unfortunately has to be restarted with an injection of money. Contact the Run Coordinator for this.
Daily Morning Routine Checks (Pre-Run Meeting)

Below is a checklist of things to check daily, typically before the Run Meeting, to prepare the Run Chief 24h summary.

- Daily Report: read through the main events of the last 24h; more details can be found in the other tools below
- Run Summary: operation plots and statistics
- LHC page 1: current state and operator comment
- LHCb Elog logbook
- ProblemDB
- DSS Alarm Screen
- DAQ Alarm Screen
- Histogram Alarms (on the Histogram Presenter screen)
- Access Control (state and activities in the cavern)

After the meeting, update the LHCb IPoD.

Weekly Agenda of meetings affecting Operation

For your information, this is a list of the weekly meetings where the operation of LHCb and LHC is discussed, and from which you may get information or to which you may provide input.

Daily
- LHC morning meeting (weekdays at 8.30, weekends at 9.00): Chaired by the weekly Machine Coordinator. Attended by the LHC Contact Person and/or the Run Coordinator. Report on the status, activities and programme of LHC, attended by the experiments mainly for information but occasionally to answer questions and make special requests. Urgent access requests are brought up there.

Monday
- LHC Program Committee (~weekly at 14.00): Chaired by the LHC Program Coordinator. Attended by the LHC Contact Person and the Run Coordinator, plus other experts needed for special topics. Scheduling meeting for the LHC experiments, with discussion and presentation of special topics concerning all aspects of operation and performance. Long-term planning is discussed here for both running and accesses.
- LHC Background Study Group (~monthly at 15.00): Attended by the official contacts on background (Gloria and Richard) and experts (Federico, Vladik, Plamen...). Discussion on instrumentation and simulation of experiment conditions, and optimization of the experiment conditions, including the needs for and use of special runs.
Tuesday
- LHCb Tuesday meeting (weekly at 14.00): News from management, the LHC Contact Person, the Run and Operation Coordinator, and the Run Chief report!

Wednesday
- LHC Machine Committee (weekly at 14.30): Official technical committee for the operation of LHC. Attended by the LHC Contact Person and the Technical Coordinator, and occasionally the Run Coordinator. This body takes the main decisions about the global activities of LHC and major changes to the machine and experiments.

Thursday
- LHCb Physics Planning Group (weekly at 9.30): Definition of the operational aim and requests for trigger and processing.
- LHCb Operation Planning Group (weekly at 16.00): Representation from the Physics Planning Group and from the whole operational chain. Round-table discussion on the progress of operation Online and Offline, urgent issues to be solved, and definition of the operational strategy.

Friday
- LHC Machine Protection Panel (weekly at 10.00): Attended by the official contact (Richard). Discussion on machine performance against protection issues. Analysis of recent events and definition of a strategy to handle them. Input to the restricted Machine Protection group, which decides on the beam intensity and beam energy.

Run Meeting

The meeting starts at 9:15 on weekdays. We have no regular Run Meetings on weekends. In exceptional circumstances (after Technical Stops, special fills or particular problems), the Run Chief or Run Coordinator can call a Run Meeting on weekends, held at 9:30 as in the past. The main meetings are on Mondays, Tuesdays and Fridays, but during data taking periods a short meeting should also be held on Wednesdays and Thursdays. The short meetings are normally cancelled during Machine Development periods which last for a week or longer. The Technical Coordination takes over and chairs the Run Meeting during Technical Stops.
Normally the Run Chief and the Technical Coordination co-chair the first and the last meeting of a Technical Stop, to make sure there is a proper take-over and return to normal data taking.

EVO should be launched at every Run Meeting. Bookings of the Run Meeting and EVO are normally taken care of by the Run Coordinator.

It is the responsibility of the Run Chief to report on the last 24h at the beginning of each Run Meeting. The Run Summary with all run statistics can be exported from the Run Coordination PVSS dashboard. The idea is to inform everybody of the activities, mainly of LHCb as a function of the LHC activities, with particular emphasis on data quality as reported in the logbook and ProblemDB, and on technical problems and events which require understanding by the subsystem experts. This should stimulate a response from the experts during the round table.

A short report from the LHC morning meeting, which is at 8.30 on weekdays and at 9.00 on weekends, is normally given by the LHC Contact Person or the Run Coordinator. The report should concentrate on the LHC activities and problems which have or may have some impact on LHCb, and should be given in public language as much as possible. Normally the slides from the LHC morning meeting are attached to the agenda by the person attending the meeting, but apart from the plan they are normally too detailed to be covered completely. They are mostly good as reference.

Each meeting should end with a short summary of the plan for the next 24h or the weekend.

The following guidelines should be used to lead the meetings on the different days:

- Monday: Summary of the whole weekend and outlook for the whole week, both for LHCb and LHC. During the round table the subsystem experts should comment on issues from the entire weekend and give a preliminary idea of the needs for interventions, calibrations, local runs and problems to be solved during the week to come.
- Tuesday: Take-over meeting, with the idea of passing on information about the current status of running of LHCb from one team to the next. Short summary by the Run Chief of the entire past week for both LHCb and LHC, and outlook for the week to come, with special emphasis on issues to be worked on during the week.
- Friday: Summary of the last 24h and outlook for the whole weekend, both for LHCb and LHC. A special topic on the agenda is Data Quality, to discuss the status of the references in the Histogram Presenter and the entries in the ProblemDB. Often there may be a slide with a list of items from the Online Data Quality responsible.
- Wednesday, Thursday, Saturday, Sunday: Short 24h report from LHCb and LHC, with comments on special issues by the subsystem piquets.

The daily Run Meeting agendas reflect these points. It is also very much appreciated if the Run Chief brings croissants for the Sunday meetings...

The round table should cover:

- Velo
- ST
- OT
- Rich
- Calo
- Muon
- L0
- HLT
- Luminosity and background
- Online
- Online Data Quality
- Offline processing and Data Quality
- Infrastructure

A printable check list for the meeting is available here.

The Run Chief should take minutes during the meeting and post them on the [email protected] mailing list. When the Run Meeting is cancelled, it is highly appreciated if the Run Chief still prepares a Run News during the day and posts it on the mailing list, to inform the collaboration on the state and activities of LHCb and the progress of LHC.

Run News and Run Minutes

The minutes of the Run Meeting (referred to as the daily Run News) should be posted on the [email protected] mailing list. They serve two purposes. Firstly, they inform the whole collaboration about the status and plans of the operation of the detector. Secondly, they are the reference for people involved in the operation regarding the status of the data taking, particular actions to be taken, and issues and problems to be solved.
For these reasons, the Run News should start with the plans for the next 24h, followed by the report from the last 24h, and then the short news from the LHC activities, written in LHCb-wide public language, for instance with rare acronyms spelled out. The news from the round table may then contain details aimed at the experts.

During the weekends, when we have no Run Meeting, it is useful to send out Virtual Run Meeting minutes with the plans, status, statistics, and a summary of the daily report from the piquets, to inform the operations team and the collaboration.

It is important to check that there are no mistakes in names or facts that could confuse the collaboration.

Be aware that you have to use the New button in the mailer to post to the Run News. If you use reply to edit an old Run News, the message is attached to the previous Run News in a thread and does not appear as a new entry. Also, please use an explicit subject such as "LHCb Run News April 7, 2011".

It is useful to include, in the report from the last 24h, a table of the run statistics as output by the Run Summary for the fill(s) of the last 24h. Also, don't hesitate to include links to important presentations on the Run Meeting agenda.

Run News items:

- Plans for the next 24h, with a note on the time and day of the next Run Meeting, and the change of Run Chief on Tuesdays
- Report from the last 24h, including the major issues brought up during the round table, in public language
- News from LHC
- VELO
- ST
- Etc.

As mentioned above, when the Run Meeting is cancelled, it is highly appreciated if the Run Chief still prepares a Run News during the day with the first three points and posts it on the mailing list, to inform the collaboration on the state and activities of LHCb and the progress of LHC.

Internal Plan of the Day

The Internal Plan of the Day (IPOD) is a short description of the activities of LHCb as a function of the LHC activities. It is meant as information for people working in the control room (piquets and central shifters).
It is edited through a PVSS interface referred to as the Run Coordination Dashboard and is displayed on the PVSS panel on the big wall screen in the control room. This display also shows overall information about the state and current activity of LHCb, to give everybody a clear idea of what can and cannot be done. The IPOD is normally updated by the Run Chief right after the Run Meeting, but it should also be updated during the next 24h if there are major changes to the programme. The date and time of the IPOD are automatically indicated just above the text field.

Example IPOD:

- Physics data taking with dump at 14.00
- LHC injection studies in the afternoon --> LHCb in STATE MD
- If 2h with no beam, call VELO for motion tests.
- If LHC access of >3h, contact Run Chief

Run Coordination Dashboard

The Run Coordination Dashboard is a PVSS panel which may be opened from any console in the control room:

START: Icon “Shortcut to Shortcuts38” --> LHC --> LHCCOM --> LHCCOM_UI_RunCoordination.pnl
LINUX (from ui01 or plus): /group/online/ecs/Shortcuts38/LHC/LHCCOM/LHCCOM_UI_RunCoordination.sh
PROBLEM: Call 160935 or 163730.

It allows you to edit the Internal Plan of the Day (Apply submits it immediately), to get the run statistics and experiment condition information from the Experiment Condition Archive for any LHC fill (Run Summary), and to control the luminosity leveling parameters.

Run Summary

The Run Summary feature of the Run Coordination Dashboard allows you to generate predefined tables and trends for each fill or for the last two days. The pages consist of the following (we are in the process of reviewing the information):

- Table with operational statistics
- Beam Monitoring page with timing information
- Background page including intensity
- Luminosity page
- Cumulative Inefficiency page, illustrating when serious problems occurred that affected the data taking (very useful to check that all problems have been reported properly and that no major events are left not understood)

The pages also display a list of the beam modes and the times at which they changed.

You may export the plots to files, which end up on the scratch area: S:\everyone\LHCbAnalysisTool

It is useful to attach the Run Summary to the agenda, at least the Luminosity and Inefficiency pages and any other page illustrating a problem, and to include the information in the Run News and in the Tuesday meeting report.

Trigger Configuration

Current Problems and Remedies

A page is available to the Shifters with a list of the current common technical problems and known remedies. It should be updated as soon as we become aware of a recurrent feature of the system which may require a standard action or recipe from the shifters during the time it takes to fix the problem. This may be related either to the central systems (ECS, DAQ, TFC, LHC communication, background/luminosity etc.) or to the subdetectors. The Run Coordinator and the Run Chief normally update this page. We are aiming at having one page per subdetector, managed by the subdetector responsible, to which we can link from the central page (in progress).

Each entry should have an explicit title which allows the shifters to immediately find the remedy based on the diagnostics. It should contain the date of entry, a description of the symptoms that identify the problem, a short explanation of the problem, and the recipe to apply. The latter two may consist of a link (with anchor) to the subdetector remedy page.

Data Quality

Data Quality problems, whether spotted by the Data Manager or by subdetector experts, should normally be reported in the ProblemDB. This is not always the case.
This means that if a problem reported in the Elog logbook or in the Run Meeting turns out to have a potential effect on the data quality offline, the Run Chief should make sure that an entry is put in the ProblemDB by the subdetector experts with the right explanation, plots attached, and attributes set. The most important point is to try to isolate a problem as much as possible to a run or a range of runs. For this reason the Shift Leaders are instructed to make a manual fast Run Change as soon as a significant change in the subdetector performance or conditions is observed.

It is also the task of the Run Chief to look through the ProblemDB entries daily, to make sure that they are followed up, solved, and closed. For major problems, we should also make sure that the person representing the Offline at the Run Meeting is made aware.

For problems with the references (missing or wrong) in the Histogram Presenter, the subdetector piquet should be informed. Also inform the Online Data Quality responsible (Olivier Callot) to make sure the problem is followed up. The Online Data Quality responsible should likewise be informed of all technical problems with the Histogram Presenter itself or with the Problem Database.

Beam Dump

Protective beam dumps may come from over 10000 systems around the LHC ring, including the experiments, and are always followed by a Post Mortem trigger distributed to all machine protection systems to initiate a readout of the history buffers for central processing. LHCb has three inputs: Beam Condition Monitor, VELO position, and LHCb magnet.

Beam dumps by an experiment are serious events and require analysis and understanding before rearming LHC for beam again. In particular, beam dumps due to high background are very complex because of the many elements and parameters involved, and are complicated to analyze.
For these reasons, the INJECTION PERMITS are disabled until we have understood the source of the problem and have some idea of how to improve the situation, typically in discussion with the LHC Engineer in Charge.

Each experiment has a Beam Interlock Supervisor assigned to this task. Thus in all cases where LHCb triggers a beam dump, Richard should be called, and in his absence Federico. They take care of informing the Run Chief and all other appropriate people. In case neither one is reachable, the Shift Leader is instructed to call the Run Chief. Since it is difficult to make the analysis, it is then up to the Run Chief to inform the CCC with as much information as is available and, depending on the source of the dump, to take the actions described below.

The Shift Leader is also instructed to contact LHC immediately to inform them that we are aware of the dump and that we are in the process of analyzing it. As soon as the beam dump has been analyzed, the Shift Leader is instructed to rearm the BCM if necessary. Rearming is done on the Big Brother screen.

Below is a list of direct diagnostic tools to identify the source of the dump:

- VELO position: If the VELO is in ANY OTHER position than OUT (as seen on the Big Brother screen, LHCb/LHC Operational Overview, LHC Experiments VISTAR) in LHC beam modes other than STABLE BEAMS and BEAM DUMP, it means that the VELO interlocked the beam. Also contact the VELO expert.
- Magnet: If the power converter trips due to temperature, power cut etc. (as seen on the LHCb Magnet display from current and field), or the cooling fails (as seen on the temperature of the IN/OUT water), then the magnet interlocked the beam. Also contact the Magnet expert.
- BCM: A background level which exceeds the threshold of one of the running sums in the BCM internal logic will lead to a dump. This will be announced by voice in the control room, and Beam Dump flashes on the Background display.
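The running-sum logic mentioned for the BCM can be pictured with a small sketch. It is only an illustration under stated assumptions: the window lengths, the thresholds, and the names RUNNING_SUMS and find_dump_condition are hypothetical and do not reflect the actual BCM settings or firmware.

```python
from collections import deque

# Hypothetical running sums: name -> (window length in samples, dump threshold).
# Illustrative values only, not the real BCM configuration.
RUNNING_SUMS = {"RS_short": (1, 100.0), "RS_long": (32, 800.0)}


def find_dump_condition(samples, running_sums=RUNNING_SUMS):
    """Return (sample index, running-sum name) of the first threshold crossing.

    Returns None if no running sum ever exceeds its threshold.
    """
    # One sliding window per running sum; old samples drop out automatically.
    windows = {name: deque(maxlen=length) for name, (length, _) in running_sums.items()}
    for index, value in enumerate(samples):
        for name, (_, threshold) in running_sums.items():
            windows[name].append(value)
            if sum(windows[name]) > threshold:
                return index, name
    return None
```

In this picture a short window reacts to a single large spike, while a long window catches a sustained moderate background that no single sample would flag.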
In addition, you may open the dedicated BCM display by clicking on BCM in the top left corner of the Big Brother screen. The bottom Message box will indicate the beam dump and the running sums and diamonds which fired. Most commonly this happens at injection, for a variety of reasons (shaky SPS beam control or extraction, transfer line steering or collimation, injection kicker timing or failure, TDI injection absorber positioning, kicker and steering magnet settings in LHC etc.). Although LHCb is in a Safe State, such a dump may still involve a very heavy dose of radiation and should not be repeated. For this reason we look at all the parameters above, together with the 25 ns structure of the background from the BLS, to understand the real source of the problem and suggest it to LHC.

However, a dump may also happen during Stable Beams, which is a priori a lot more serious since LHCb is generally ON. We have had only one important event in the past, due to a UFO event in the triplet right of LHCb (downstream). Even more parameters are involved with circulating beams...

You may get more information by opening the Post Mortem display in the BCM panel. That panel gives the current per diamond in the two stations for the different running sums. The eight diamonds are counted clockwise around the beam pipe as seen along the Z axis of LHCb. Upstream (U) is the station behind the VELO and downstream (D) is the one after the TT. Clicking on the actual post mortem trigger associated with the dump gives the detailed background trends per diamond and running sum, as read out from the history buffers.

Luminosity Leveling

Luminosity leveling essentially means that LHCb defines its optimal/maximum luminosity in real time from a set of configurable and monitored parameters, in a leveling function running in the LHCCOM PVSS project, and transmits it to an optimization application running in the LHC control system. That application drives corrector magnets not far from the end of the IP8 Long Straight Section.
The corrector magnets produce a parallel separation between the two beams in the vertical plane to adjust the luminosity.

The configurable parameters may be changed using the Run Coordination Dashboard:

- Target beta*: Upgrade function, should be set to 3m permanently.
- Limit Lumi [1E30]: This limit depends on the detector stability and immunity to the high particle flux from the pileup*rate plus background, together with back splashes from the collisions. It may have to be readjusted to improve the operational stability, for instance if the detector frequently has major HV trips. We hope that we'll be able to take LHCb beyond 3E32, but it is still a hope.
- Limit Mu: This limit is essentially a hardware limit due to the output bandwidth of the readout boards (TELL1s and UKL1s) and the maximum bandwidth across the readout network. The system should be able to cope with a mu of 2.5 at 1 MHz, but this also depends on other parameters. For instance, the limit may be effectively lower if there is a large spread in the mu per bunch.
- Limit L0 Rate [kHz]: By construction of the FE, the average L0 rate limit is 1 MHz. ODIN controls this rate and makes sure there is never any buffer overflow in the system by throttling the trigger, effectively introducing deadtime. It is usually more beneficial to run with lower luminosity than with deadtime, which is why this number should be set in the leveling function and why it may be lower in some special circumstances.
- Limit Thruput [GB/s]: The bandwidth limit across the readout network. 65 GB/s is said to be the hard limit, which basically should allow a mu of 2.5 at 1 MHz.
- Limit Deadtime [%]: The acceptable limit on deadtime as introduced by ODIN to control the trigger rate. If this limit is exceeded, the leveling application will instead try to lower the luminosity.
- Disable/Enable Luminosity Leveling: Does exactly what it says.
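How such limits could combine into a single target luminosity is sketched below. This is a minimal illustration, not the leveling function in the LHCCOM project: the class and function names, the linear scaling of mu, L0 rate and throughput with luminosity, and the 10% back-off on excess deadtime are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class LevelingValues:
    """One set of values, used both for the limits and for the measurements."""
    lumi: float        # luminosity [1E30]
    mu: float          # average number of interactions per crossing
    l0_rate: float     # L0 rate [kHz]
    throughput: float  # readout network throughput [GB/s]
    deadtime: float    # deadtime [%]


def target_luminosity(limits, now, deadtime_backoff=0.9):
    """Largest luminosity [1E30] keeping every monitored quantity within its limit.

    Assumes mu, L0 rate and throughput scale linearly with luminosity, and
    that all measured values are non-zero (colliding beams).
    """
    # Headroom factor of the most constraining quantity.
    scale = min(
        limits.mu / now.mu,
        limits.l0_rate / now.l0_rate,
        limits.throughput / now.throughput,
    )
    target = min(limits.lumi, now.lumi * scale)
    if now.deadtime > limits.deadtime:
        # Deadtime above its limit: prefer a lower luminosity (assumed step).
        target = min(target, now.lumi * deadtime_backoff)
    return target
```

With limits of 300 [1E30], mu 2.5, 1000 kHz, 65 GB/s and 5% deadtime, a measurement of 200 [1E30] at 52 GB/s gives a throughput-limited target of 250 [1E30]. As a cross-check of the numbers quoted in the list, 65 GB/s at an L0 rate of 1 MHz corresponds to an average of 65 kB per event.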
You may see the current status of the parameters on both the Run Coordination Dashboard and the Luminosity display in the control room. The Luminosity display also indicates whether leveling, in terms of controlling the corrector magnets, is currently active or not. It also allows you to track the vertical separation between the beams, which should naturally decrease during the fill as the intensity drops and the emittance grows.

In practice, the leveling has a default offset at the start of a fill, when the separation bumps are collapsed to start colliding the beams. This ensures a safe luminosity when STABLE BEAMS is declared; an uncontrolled luminosity at that point may have serious consequences for LHCb (deadtime, detectors tripping, HLT nodes filling up or blocking, problems closing the VELO etc.). The VELO is closed in this situation, and the first leveling to the optimized luminosity takes place right after the VELO has been closed. Once the target luminosity which satisfies all criteria has been reached, the LHC leveling application re-optimizes LHCb as soon as the relative drop in luminosity exceeds a threshold (5-10%, equivalent to O(30) min with normal luminosity lifetimes) OR if we make major changes to the configuration parameters of the luminosity leveling. These may be changed on the fly.

LHCb Dipole Polarity

The LHCb dipole is controlled by the LHC together with the associated compensator magnets. Occasionally we change the LHCb dipole polarity. The plan is usually discussed in the Operation Planning Group and in the LHC Program Committee. The Run Chief, the Run Coordinator or the LHC Contact Person then requests the actual polarity change at the appropriate time between two fills. Note that the LHC convention is NOT to use the field direction (mostly...) but the actual polarity of the power converter. Unfortunately, a positive power converter setting corresponds to a down field direction, along the negative LHCb Y coordinate.
Make sure you are clear about what you request. The easiest is usually to refer to a previous setting, for example "the opposite of what we had yesterday".

Access Request Handling

Requests for access to the detector are mainly handled by the Run Chief. The detailed procedure is described here.

The following steps and actions should be taken by the Run Chief upon receiving the access request document:

- Normally, the GLIMOS (Eric Thomas), the Technical Coordinator (Rolf Lindner), the RSO (Gloria Corti), and the Run Coordinator should be informed of the access so that they can see whether there are any constraints. If they are not on the initial mail, forward the access request to them.
- Check the access request for correct information.
- Add it to the list of pending requests on the TWiki page.

The most important point is to make sure the request is passed on to the LHC Contact Person or the Run Coordinator.

Once LHC gives the possibility of access, you act as Access Supervisor, as described in the note from the LHC Program Coordinator. This means controlling who enters the zone and making sure no one bypasses the rules, in particular when passing through the MAD. The LHCb RPE (Gloria or Eric) should be present to open the zone for access. The RPE or the Shift Leader normally asks for the access to be delegated to us. The access control delegation should be released (call to the CCC) when the accesses are finished. You should also make sure the access report is produced and filed appropriately. The access procedure instructions for Shift Leaders are explained here.

The Run Chief should print the Access Requests and give them to the Shift Leader, so that it is clear to the Shift Leader which accesses have been granted and who is intervening.

Handling safety incidents

The Run Chief should be called by the shift crew immediately upon a major incident: serious Detector Safety System event, power cut, Level 3 Alarm, etc.
The role of the Run Chief is to coordinate the activities and to make sure that the actions are logged and that all problems are addressed. Make sure the GLIMOS and the Run Coordinator are informed. Please read the Shift Leader/SLIMOS instructions on the handling of DSS alarms, power cuts, and Level 3 Alarms. The above description of the handling of DSS Alarms is a revised version of the official instructions.

Report at the LHCb Tuesday meeting

At the end of the duty period, the Run Chief will usually be asked to give a summary of the one- or two-week period at the LHCb Tuesday meeting. This is an opportunity to explain the status of the running and the experiment conditions, and to inform the collaboration about the status of the trigger and the amount of data recorded under which conditions. In particular, it allows explaining problems and issues which may have an effect on the data quality or data processing, and the actions taken. It is also an opportunity to raise worries which should be understood and addressed by the whole collaboration. The report may also include the LHC activities and progress which, in one way or another, have positive or negative consequences for LHCb. It is advisable to contact the Offline data processing team to get information and news about the processing and data quality to include in the presentation.

-- RichardJacobsson

Attachments: Operation_checklist.pdf (7.9 K, 2010-07-30 14:57, RichardJacobsson)

Topic revision: r19 - 2013-08-06 - RichardJacobsson

Posted on: Sat, 09 Aug 2014 12:33:57 +0000
